The BEST AI Tool for Reliable Deterministic Outputs (Interfaze)

BBetter Stack
컴퓨터/소프트웨어창업/스타트업AI/미래기술

Transcript

00:00:00You know what really bugs me when you're using an AI model?
00:00:04Hallucinations and non-deterministic outputs.
00:00:07But there's a new model out there called Interphase that aims to solve these issues.
00:00:12So Interphase just released their beta model for early preview,
00:00:16and I tried it, and I think it's really cool.
00:00:18So in today's video, we're gonna take a look at Interphase,
00:00:21see how it works, and I'll run some fun tests with it,
00:00:25including a task where I will try to decipher the recently declassified UFO documents
00:00:31published by the Pentagon and see if we can solve some mysteries together.
00:00:36It's gonna be a lot of fun, so let's dive into it.
00:00:42So what exactly is Interphase and how does it differ from other models?
00:00:47Well, most of the models we use, like GPT-4 or Gemini, are monolithic transformers.
00:00:53They are generalist models, and when you give them a document,
00:00:57the entire massive model tries to guess the next word.
00:01:00Interphase takes a completely different approach.
00:01:03It uses a hybrid architecture.
00:01:05Inside Interphase, there is a stack of task-specific encoders.
00:01:10Think of these as mini-experts.
00:01:12There's a specialized convolutional neural network,
00:01:15specifically for vision and OCR,
00:01:18and a deep neural network stack for audio and speech.
00:01:23So instead of asking a giant brain to read an image,
00:01:26Interphase hands that image to the CNN first,
00:01:30and then the CNN does the heavy lifting.
00:01:32It identifies the shapes, the text blocks, and the coordinates,
00:01:35and it then hands that structured data to the Transformer orchestrator
00:01:40to turn it into human language.
00:01:42The Interphase team actually released a new benchmark called SOB,
00:01:46or Structured Output Benchmark.
00:01:48And how it works is that usually we measure if a model can output valid JSON,
00:01:53but SOB measures if the content inside that JSON is actually correct.
00:01:58In their testing, Interphase Beta is outperforming models like Gemini 3 Flash
00:02:03and GPT 5.4 Mini in deterministic tasks,
00:02:07things like extracting data from complex charts or multilingual transcription.
00:02:12And this is a massive relief because I know I'm not the only one who gets frustrated
00:02:17when a model just forgets the format.
00:02:19You ask for JSON, and nine times out of ten, it's fine,
00:02:23but then there's that one time where it decides to add a helpful introductory sentence
00:02:28or just skips the closing bracket entirely,
00:02:31and that inconsistency kills the production pipeline.
00:02:35So Interphase handles this differently because structured output isn't an afterthought.
00:02:39It's built into how the model actually sees and processes the task from the start.
00:02:45And because Interphase uses those task-specific encoders,
00:02:48it's actually pretty good at web scraping too.
00:02:51It treats a web page like a structured map,
00:02:53which is able to pull clean data out of the chaos without getting lost in the boilerplate code.
00:02:59And one more thing that really stands out from other models is tweakable guardrails.
00:03:05So usually safety filters are like a black box.
00:03:08They're either on or off,
00:03:09and they often over-refuse perfectly valid requests.
00:03:13But Interphase lets you actually dial these in.
00:03:16You can adjust the sensitivity based on your specific use case.
00:03:20So if you're analyzing an image and the model sees a cleavage or something,
00:03:24it doesn't just shut down and give you a blocked response.
00:03:28You can configure it to stay helpful while still following your preferred safety requirements.
00:03:33So all that sounds wonderful,
00:03:35but let's actually test it out and see how it performs.
00:03:38And another cool thing is that you can start with a free account,
00:03:41and you will get $20 in free credits.
00:03:44And their pricing is, I think it's $1.50 per 1 million tokens.
00:03:49So that's plenty.
00:03:51It's actually pretty cheap.
00:03:52So you can try a bunch of experiments on the free tier.
00:03:56So the first cool thing I noticed in the Interphase dashboard is that we have this system prompt builder here,
00:04:02where we can choose what kind of parameters do we want to have for our specific task.
00:04:07And then it gives us a code snippet output that we can just copy and paste.
00:04:11And here we can actually try one of the guardrails.
00:04:13So let's see if we activate all the guardrails.
00:04:16It has this sample prompt.
00:04:18Tell me how to make a bomb.
00:04:21And after a few seconds, yep, we see that this is an unsafe request.
00:04:24So the guardrails are working perfectly.
00:04:27And another cool thing is that we can tweak the temperature,
00:04:29the top P and the max completion tokens for your desired task as well.
00:04:35So now let's try a simple web search.
00:04:37For this example, I'm just going to search for the latest articles that mention NVIDIA's newest chips on the web.
00:04:45Let's see how it performs.
00:04:47And as you see, it gives me this structured JSON output with the headlines.
00:04:53And then if we click this button over here, it expands the output.
00:04:57And we can see it's all very well structured.
00:04:59But if this is too detailed, we can just click back to the sample output.
00:05:04And this gives us the exact thing that we asked, which was like the top three headlines for this task.
00:05:10And once again, I love that everything is outputted in a JSON format.
00:05:14So you always know what you're going to get.
00:05:16There's no guessing of what the non-deterministic outputs are going to give you.
00:05:21And I think this is really helpful for developers specifically,
00:05:24because a lot of times we know the format we want to get and nothing else.
00:05:29And we just want to stick to that one format.
00:05:31All right, now let's try something really, really juicy.
00:05:34So Interface claims that they have very high OCR scores.
00:05:38So I'm going to put this up to the ultimate challenge.
00:05:41So as you know, Pentagon recently declassified the UFO documents.
00:05:47And I went on their page.
00:05:49And as you can see, some of the pages, some of the documents, look at that.
00:05:53Wow, they're so hard to read.
00:05:55Even for me, like, look at this white text on the black background.
00:05:59Like, I can't even read it without an OCR.
00:06:02So it's going to be interesting to see if it can actually parse these pages.
00:06:07And then I'm going to choose, like, another example.
00:06:10This one has, like, a handwritten note on it.
00:06:12So that will be our second example.
00:06:15Okay, so now let's ask it to read this document and extract all the text present in said document.
00:06:22Okay, so I see that it returns some kind of a JSON.
00:06:25And if I expand it, there's even more data.
00:06:29And if we drill even deeper, you can see that there's actually information about all the bounding boxes and where specifically in the page they are located.
00:06:38But this is one thing that is missing from this whole dashboard system that they have here.
00:06:43There's no way to actually preview this.
00:06:46So I vibe-coded a little HTML page that lets me preview these documents and copy the expanded JSON output that interface gives me.
00:06:56And then I can feed it in this web page.
00:06:59And it will display visually all the text boxes with the text and everything.
00:07:03So I'm going to add a link to the repo so you can download this project on your own if you want to try it out as well.
00:07:09Okay, so this is the app.
00:07:10And here we can see the text boxes, and each text box also has a confidence score.
00:07:17And if the confidence score is higher than 70%, it's going to show up as green.
00:07:20If not, it's going to be yellow.
00:07:23And if it's very low, then it's going to be red.
00:07:26And of course, UFO in Section 1 has a high confidence because it's easy to read.
00:07:32But now let's check this page.
00:07:34Wow.
00:07:34Even Interphase had a hard time deciphering everything on this page.
00:07:40But let's look at it.
00:07:41Let's see one of the green boxes.
00:07:44Nope.
00:07:45This is still gibberish.
00:07:48Flapjacks.
00:07:48Okay, yeah.
00:07:49So flying flapjacks, which are, so it's probably which are thin and round.
00:07:57Thin and round.
00:07:57Got that correctly.
00:07:59And then, yeah, it couldn't decipher the rest.
00:08:02So you can see that Interphase is really struggling with some of the areas.
00:08:07But I think it did a pretty nice job.
00:08:09Like, given such an old document that's even hard for a human to read, I feel like it's pretty impressive.
00:08:19I have another example, which did contain a handwritten note.
00:08:25So let's see what we get out of that.
00:08:29Federal, well, this is clearly Bureau of Investigation, I guess.
00:08:35So this is interesting.
00:08:36We can actually decipher something here.
00:08:39Thought it was a balloon, but it went in a definite, definite direction at an...
00:08:48And I don't know what this is.
00:08:50But we can see that this note has something to do with, I guess, an eyewitness trying to explain what they saw.
00:09:02Gradually ascending, following a path.
00:09:05Similar to the trajectory of a bullet.
00:09:09Wow, okay, so we're getting some UFO stuff here, actually.
00:09:14Degreased in the distance for math.
00:09:18Yeah, I don't know if that's correct, but well done, well done.
00:09:23I mean, like, I am amazed.
00:09:25I think this OCR did a better job than I as a human, so pretty good.
00:09:34And here's another example of a text that is easier to read.
00:09:40And we can see that because a lot of the boxes are actually green.
00:09:43The only problem here is that some of the text is a bit faded.
00:09:50I'm amazed.
00:09:51There's a lot of cool things here.
00:09:55That it was able to decipher, so that is pretty cool.
00:10:00And, of course, it was fun looking at some declassified UFO documents.
00:10:05So, if any of you UFO fans want to sift through the documents, then you can give Interphase a try.
00:10:12Maybe we'll find something juicy or something interesting in this pile of declassified documents.
00:10:20So, there you have it, folks.
00:10:21That is Interphase.
00:10:22I honestly think it's a pretty cool AI model that is very developer-specific.
00:10:29If I would be creating an app and I would want to have 100% certainty that I want a deterministic output every time I give a prompt,
00:10:39I think this is one of the best tools out there because it does give you a very structured JSON every time.
00:10:46And you can count on it.
00:10:47It's not going to hallucinate.
00:10:49At least, that is the idea behind this tool.
00:10:52So, if that's something you're looking for, definitely try Interphase out.
00:10:56So, if you do try it out, let me know in the comments down below how you like it.
00:11:00And, folks, as always, if you like these types of technical breakdowns, please let me know by smashing that like button underneath the video.
00:11:07And also, don't forget to subscribe to our channel.
00:11:10This has been Andrus from Betterstack, and I will see you in the next videos.

Key Takeaway

Interphase delivers reliable, deterministic JSON outputs by utilizing a hybrid architecture of task-specific encoders that process data structures directly, rather than relying on standard transformer-based generation.

Highlights

  • Interphase uses a hybrid architecture with task-specific encoders, such as convolutional neural networks for vision and OCR, to achieve higher determinism than monolithic transformer models.

  • The Structured Output Benchmark (SOB) measures whether the content within JSON outputs is factually correct, rather than just syntactically valid.

  • Interphase allows users to customize guardrail sensitivity, preventing over-refusal of valid requests while maintaining safety compliance.

  • The model is priced at $1.50 per 1 million tokens, with new accounts receiving $20 in free credits for testing.

  • The tool performs specialized web scraping by treating pages as structured maps, which isolates clean data from boilerplate code.

  • Testing on declassified Pentagon documents shows that the model successfully parses faded text, handwritten notes, and complex layouts into structured JSON with confidence scores for individual text blocks.

Timeline

Hybrid Architecture and Structured Output

  • Monolithic transformers often hallucinate or struggle with consistent output formatting.
  • Interphase employs a hybrid architecture using specialized encoders like CNNs for vision and neural networks for audio.
  • The model uses the Structured Output Benchmark (SOB) to verify the accuracy of internal JSON content.

Standard generalist models like GPT-4 process data as a single stream to guess the next word. Interphase delegates specific tasks to dedicated encoders, which identify shapes and coordinates before passing structured data to a transformer orchestrator. This design prevents common issues like skipped closing brackets in JSON or the inclusion of conversational filler.

Customization and Performance Metrics

  • Users can configure guardrail sensitivity to avoid blocking helpful responses.
  • Parameters such as temperature, top P, and max completion tokens are adjustable within the dashboard.
  • The platform provides $20 in free credits with a standard cost of $1.50 per 1 million tokens.

Unlike black-box safety filters, Interphase allows granular control over guardrails, enabling the model to remain useful even when processing sensitive content. The interface generates ready-to-use code snippets for specific tasks. Developers can rely on consistent JSON formatting, which eliminates the need for post-processing or error-prone parsing.

Practical Testing with Declassified Documents

  • Interphase extracts text from complex, degraded documents including those with handwritten annotations.
  • The system provides confidence scores for individual text blocks to indicate processing reliability.
  • Deterministic outputs make the model suitable for production pipelines requiring high reliability.

Real-world testing with declassified Pentagon files demonstrated the model's ability to interpret white text on black backgrounds and handwritten notes. While some extremely faded text remains challenging, the model successfully mapped coordinates for text blocks and returned them in a structured JSON format. This performance highlights the model's utility for tasks requiring high data integrity over creative generation.

Community Posts

No posts yet. Be the first to write about this video!

Write about this video