Transcript
00:00:00codex might have claude code beat here with the release of the brand new experimental goals feature
00:00:05codex is now the easiest way to execute long-running autonomous coding tasks without
00:00:10having to include any sort of additional orchestration layers goals acts like a more
00:00:15sophisticated integrated ralph loop you give it some sort of objective and it will work
00:00:19for potentially hours upon hours to solve that problem without you needing to intervene at all
00:00:25and today i'm going to show you how it works how you can set it up and we'll go through a real
00:00:29demo so you can see this thing in action so today we'll be creating rift salvage our 2d combat video
00:00:35game that uses completely original assets and that we build strictly through goals the goals feature
00:00:42is one of the real differentiators with codex right now and it is hilariously simple to use
00:00:47we're talking about a single slash command so there's a ton of value to be had here so whether
00:00:51you're using the codex desktop app or the codex cli you have to enable goals because it's an
00:00:56experimental feature now you can prompt codex to do that or you can do it yourself very quickly
00:01:01inside of the codex app i'm just going to go to settings and then i am going to go to configuration
00:01:07right here where it says open config.toml i'm going to click that i'm going to open it up in vs code
00:01:15and down here you need to add two lines if it's not already there features and then goals equal true
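The two lines described here would look roughly like the fragment below. The table and key names are taken from the spoken description only, so treat the exact spelling as an assumption and check it against your own config.toml.

```toml
# config.toml for codex
# assumed spelling, based on the spoken "features and then goals equal true"
[features]
goals = true
```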
00:01:22that's it should take you like two seconds if that's too complicated you can also tell codex
00:01:27hey can you enable goals for me so features goals equals true that's it now to actually use goals
00:01:35inside of the desktop app and inside the cli you're just gonna do forward slash goal now for whatever
00:01:40reason i think it's because it's new and experimental when you do forward slash goal you're not going to
00:01:43get any like notification that it's actually working and you'll see once we give it a proper prompt
00:01:48that we will actually get a little badge that we know goal is working so if you enable it make sure
00:01:53you reset codex after you do that just to make sure the changes hit but when you do forward slash
00:01:58goal you're not going to see anything like you normally would like if you did you know a skill or
00:02:02something where you get like some proper you know feedback that it's working but this is good but
00:02:08before we actually demo goal inside of the app let me explain how it actually is working under the hood
00:02:13but first a quick word from today's sponsor me so as you know inside of chase ai plus i have the
00:02:18claude code master class but i also just released the codex master class so you now have two tools that
00:02:24can help bring you from zero to ai dev and this is the best place to learn how to do that because i
00:02:29assume you have no technical knowledge and we focus on real use cases so if you want to get your hands
00:02:34on this or if you want to listen to my free webinar that i'm running in a couple days the link will be
00:02:40down in the pinned comment hope to see you there so like i said in the intro codex goals is basically
00:02:46a more sophisticated integrated ralph loop now what is a ralph loop you asked well we'll do a
00:02:51quick review for those of you who don't remember at its core a ralph loop if we were using it in
00:02:57something like claude code is simply one line of code it's just a bash loop it's exactly what you see
00:03:03right here and the idea is i run this line of code and what's going to happen is it's going to spin up
00:03:09claude code or spin up codex or any ai system and it's going to take a look at a prompt.md file and
00:03:16this prompt is going to say hey here's what we're trying to do here's how i want to do it by the way
00:03:21here's the criteria that will consider it complete so in this example we want to lift coverage on
00:03:28authentication files which basically means we need to create more tests and we will stop when coverage
00:03:33is at 75 percent so that's the end goal and so the way it would work is you would start this loop and then
00:03:41the loop takes a look at the prompt it then injects that into the ai session the session runs a single
00:03:48turn it reads the prompt and it also reads a state.md file the state file is basically a file that it
00:03:56can take a look at saying okay if we have task one two and three what have we done so far and is it
00:04:03working so say the first few turns it completes task one and then the next turn it's going to go
00:04:10take a look at the state file and say hey task two isn't complete guess what we're going to do in this
00:04:14session or we're going to do session two and then maybe it doesn't work for the first turn it says
00:04:18hey here's what i tried next guy comes etc etc until it completes all the tasks and so after that agent
00:04:25runs its turn it updates the file the turn ends and the loop continues so you get this sort of
00:04:30like continual loop where it's constantly checking a couple different files to see what have we done
00:04:35what do we need to do what is the end state and eventually once it reaches the completion criteria
00:04:41it says hey we're done all autonomous that's the idea of Ralph loops now if you want Ralph loops to
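The loop just described can be sketched in a few lines. This is a minimal Python sketch rather than the bash one-liner shown on screen, and it makes assumptions: `run_turn` and `is_done` are hypothetical stand-ins for one agent turn and for the completion criteria, and the turn cap is an added safety net that a bare Ralph loop does not have.

```python
from pathlib import Path
from typing import Callable


def ralph_loop(
    prompt_path: Path,
    state_path: Path,
    run_turn: Callable[[str, str], str],
    is_done: Callable[[str], bool],
    max_turns: int = 100,
) -> int:
    """Feed the same prompt to a fresh agent turn until the state says done.

    run_turn and is_done are hypothetical stand-ins: run_turn(prompt, state)
    plays one agent turn and returns the updated state text; is_done(state)
    encodes the completion criteria. Returns the number of turns used.
    """
    prompt = prompt_path.read_text()
    for turn in range(1, max_turns + 1):
        # Each turn starts cold and only knows what the state file records.
        state = state_path.read_text() if state_path.exists() else ""
        new_state = run_turn(prompt, state)
        state_path.write_text(new_state)
        if is_done(new_state):
            return turn
    # A bare Ralph loop has no cap at all; this one is an added safety net.
    raise RuntimeError(f"not done after {max_turns} turns")
```

Everything the next turn knows comes from the state file, which is why the prompt has to spell out the completion criteria up front.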
00:04:48do more things it requires additional scaffolding you know things to do with like billing what do
00:04:53you do is there any sort of like smart token usage not necessarily what happens if it shuts down right
00:04:58the agent crashes you control c how does it know it's actually done is there actually like a built-in
00:05:02third party that verifies everything's done not really because at its core again it's just a single
00:05:08line of code now compare that to goals goals big picture works the same we're telling it to do
00:05:15something it has an idea of how it's going to do it and it's constantly updating internal files saying
00:05:19here's what I've done here's what we still need to do and it's trying to reach that end state so big
00:05:23picture it's pretty much the same however there's a few differences first of all we have these two
00:05:29markdown files which are essentially invisible to you it's continuation and budget limit what are
00:05:35these two things doing well these things allow codex to act in a different manner if you're
00:05:40about to bump up against usage limits which is important so there's actually sort of a graceful
00:05:46ending for how your system will handle a task in a goals loop versus a ralph loop ralph loop you
00:05:52hit your budget you're done codex not necessarily it will figure out a good way to sort of like
00:05:57get you to a spot that you can work on later and the way that happens in reality is codex runs its
00:06:03turn in its goals loop or ralph loop however you want to think about it and when it reaches the end
00:06:08of turn it really has four paths it can go down one if it still has work to do and the budget is good
00:06:13hey we're just going to keep on trucking two if we are near our token cap what it's going to do is
00:06:19it's going to inject that budget limit.md file and it's going to essentially wrap up the turn gracefully
00:06:25and give you a final report for what's been done and what you need to do moving forward if you update
00:06:29your limit three if we have finished the project it's going to make an update goal tool call so it's
00:06:34going to go ahead and change its status it's going to make sure all the deliverables are audited and if
00:06:39everything comes back thumbs up hey goal complete we're done lastly we have ways to pause the goal
00:06:45edit the goal deal with crashes so in the event something goes wrong while we're doing our loop
00:06:49well it's not like a traditional ralph loop where we're kind of just like boned so a little more
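The end-of-turn branching described above can be pictured as a small decision function. This is only an illustrative sketch, not how codex implements it: the names `TurnState`, `Next`, and `next_step`, the 90 percent threshold, and the order of the checks are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Next(Enum):
    CONTINUE = "continue"   # work left and budget is fine
    WRAP_UP = "wrap_up"     # near the token cap: stop gracefully, write a report
    COMPLETE = "complete"   # work finished and the audit passed
    PAUSED = "paused"       # user paused or edited the goal


@dataclass
class TurnState:
    work_remaining: bool
    tokens_used: int
    token_budget: int
    audit_passed: bool = False
    pause_requested: bool = False


def next_step(state: TurnState, near_cap: float = 0.9) -> Next:
    """Pick one of the four paths at the end of a turn (hypothetical logic)."""
    if state.pause_requested:
        return Next.PAUSED
    if not state.work_remaining and state.audit_passed:
        return Next.COMPLETE
    # near_cap is an assumed threshold, not a documented codex value.
    if state.tokens_used >= near_cap * state.token_budget:
        return Next.WRAP_UP
    # Covers both "work left" and "finished but the audit failed".
    return Next.CONTINUE
```

A plain Ralph loop only has the continue and complete branches, which is the difference being drawn here.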
00:06:54sophisticated than a ralph loop very similar big picture and we don't have to do any additional
00:06:59orchestration this whole thing should sound very familiar to you if you've ever worked with
00:07:05something like gsd or superpowers all these tools are orchestration layers that sit on claude code to
00:07:11essentially do what we're doing with a single slash command inside of codex with goals and because it's
00:07:18literally just a single slash command it makes it super easy to execute you don't need to watch a
00:07:2440-minute demo on all the intricacies of gsd you just kind of do forward slash goal and codex goes
00:07:30forth and conquers and so with that in mind let's actually put it to the test so first of all we're
00:07:35going to put this guy in plan mode because we can go from plan mode to goals very easily
00:07:39and we're going to have it create essentially a top-down arcade survival game for us and we're
00:07:44going to have it create all of its own assets the cool thing about codex versus something like
00:07:49claude code for example is because it's an open ai product we have access to image 2 the gpt images 2
00:07:56so it's going to create all of its own assets for this game i want a player drone sprite i
00:08:01want three enemies i want a boss creature energy core hazard mine rift background badges two ui
00:08:07flavor assets so i'm going to have it create quite a bit okay so the prompt is relatively sophisticated
00:08:15because this can go on for a long long time like i should have shown you the screenshot already
00:08:18the guy who's like i'm having it run for 50 straight hours i mean who knows if 50 straight
00:08:23hours is really the best way to do this but the idea is we have a fuzzy idea we go into plan mode
00:08:31we get something very very tight and very importantly with something like this is you
00:08:36need to be extremely specific about what the end result needs to be because if we don't have a very
00:08:43specific end result we are shooting for a very quantifiable set of things it must hit in order
00:08:50for it to complete the loop you're going to get an outcome that is kind of mediocre it might be
00:08:55half baked so i highly suggest you go through plan mode and you take the time to actually flesh out
00:09:02the plan and not say like slash goal make me a sas product that makes a billion dollars and so here's
00:09:07the plan for our game and when it comes to verification this is what it's going to be looking
00:09:12at right this is what it's actually going to test before it says it's complete obviously it needs
00:09:17to run npm run build and fix all the errors start the dev server and provide the local url add and
00:09:24run an automated playwright verification script that opens the app confirms everything loads
00:09:29checks the canvas is non-blank simulates keyboard movements simulates collectible event forces damage
00:09:34confirms health changes boss win state uis on and on and on and on so this is what you really want
00:09:39to take a look at you know if you look at the verification and you say hey if all that is
00:09:44completed i will be happy well then you're good to move forward now when it says implement the plan
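One way to think about the verification list is as a set of named pass or fail checks where the goal only counts as complete when every one passes. The sketch below is a hypothetical illustration of that idea: `run_checks` and `canvas_is_non_blank` are made-up helpers, not part of codex or playwright, and the real checks in the demo run through a browser.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def canvas_is_non_blank(pixels: Sequence[int]) -> bool:
    """Treat a canvas as non-blank if it holds more than one distinct pixel value."""
    return len(set(pixels)) > 1


def run_checks(checks: Dict[str, Callable[[], bool]]) -> Tuple[bool, List[str]]:
    """Run every named check; return (all_passed, names_of_failed_checks)."""
    failed: List[str] = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            # A check that crashes counts as a failure, not as a pass.
            ok = False
        if not ok:
            failed.append(name)
    return (not failed, failed)
```

If you read the list of check names and would be happy with the result when all of them pass, the plan is specific enough to hand to a goal run.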
00:09:49you're going to want to go to no i'll tell you what to do you can do forward slash goal use goal
00:09:54to implement this plan and we're going to submit and so right up here what do you see you have this
00:10:02little badge that says goal so now i know we're doing goal and it says it right here as well so
00:10:09like i told you before when you do forward slash goal you're not going to get any commands but it's
00:10:12working i think it's just sort of a ui bug for it being an experimental feature so it says it's
00:10:17still in plan mode so we'll cancel that goal use goal to implement this plan so a little rough around
00:10:28the edges still but let's see what it actually does for us the idea is now i'm completely hands-off
00:10:34you know it's going to execute its little ralph loop its little goal thing and at the end we're
00:10:39going to have a final product so it's been working for about 12 minutes now and you can see it's
00:10:43already in the process of creating all the different assets using the image gen 2 model
00:10:49which is like pretty sweet and again the other nice thing is when you're using the desktop app versus
00:10:54just scraping in the raw terminal like all this is presented to you in line which is which is nice i
00:11:00personally have been very impressed with the codex desktop app um not to say i don't still love claude
00:11:06code i think i use both these tools interchangeably you can kind of watch my last video for my whole
00:11:11bit on that where i think the idea that we need to choose between these two tools is kind of stupid
00:11:15like why are we not just using both and often both of them in tandem um but with claude code i'm very
00:11:20much pure terminal but with codex i've really enjoyed the desktop app and part of that might
00:11:26just be it's a nice change of pace sometimes too versus always being in the terminal all the
00:11:32time so so far i've really liked it so after about 30 minutes it said it was done and actually it
00:11:38finished it up faster than i thought it would so let's see how it did on the first pass and because
00:11:44it did this so quickly i'll probably ask it to do some stuff at the end so it says it implemented
00:11:49rift salvage local dev server is running here it's a canvas game with keyboard touch control spawning
00:11:56enemies mine scoring shield power ups boss phase win lose pause and restart 11 image gen bitmap
00:12:03assets with alpha cutouts automated playwright verifier and then shows us all the things it built
00:12:10which is pretty cool so let's see if it works and what we can add to kind of push it a
00:12:17little bit more oh let's actually do it in the real browser okay so i have a little loading screen
00:12:27and contrast is a little low kind of hard to see it might be kind of hard for you to see it but
00:12:32i have my little spaceship so that's a mine i think i'm supposed to like grab these things
00:12:39while it spawns enemies that chase me so you know it works it looks kind of cool i think we could
00:12:49probably work on the graphics a little bit but it is kind of neat that everything here was created
00:12:56like as unique images i think what we could do is we could add well first of all i want to see what
00:13:00the boss fight looks like if we could kind of speed that up and also add some sort of like
00:13:04shooting system either with like lasers or something cool like that so let's actually do
00:13:11that let's have it do that before we sit here any longer so i'm going to throw it in plan mode
00:13:15and see if we can make it work a little bit harder okay so i think that was a pretty good first pass
00:13:19everything's working but i'd like to make it a bit more complicated can we add some sort of
00:13:24like combat system whether that's like lasers shooting at you know different enemies and they
00:13:31shoot back at us could we also have the boss phase come a little bit quicker or include some sort
00:13:37of button i can press to just have the boss phase start could we also change the contrast a little
00:13:42bit because right now everything kind of blends into the background and if you have any other ideas
00:13:49to sort of just make this a little bit more complicated and push you to your limits
00:13:53let me see those ideas so this is the plan it came up with now one thing you want to know when you're
00:13:58using the goals system each goal run is tied to the thread or the session that you are using at that
00:14:07time so we've been in the same chat which means we're in the same goal thread if i want to do
00:14:12goals again i want to do a second goals run on the same project we can do that but we have to do it in
00:14:18a second thread or a second chat like opening up another terminal so all i'm going to do is copy
00:14:24this plan i'm going to open up another chat and we're going to do slash goal and we're going to
00:14:33paste this in there so after 15 minutes we completed the second goal pass so it implemented the combat
00:14:40upgrade so let's see what this game looks like now so here's the loading screen again very similar
00:14:44to what we saw the first time except it added a few sort of widgets up top here so we have target
00:14:50combo as well as the boss signal now so if we launch it right away i'm kind of shooting my
00:14:56gun the enemies are able to shoot back and they have sort of hit points i can also
00:15:01hit the boss signal so there is the boss um pretty sick looking actually i i think the coolest thing
00:15:09about this game and what it did was just all the unique assets right the fact that everything is
00:15:13an original asset and that it did all this using image gen 2 which i think was pretty
00:15:19sweet um and i know obviously this only took about 45 minutes total between the two runs and we saw
00:15:24some people doing runs for like three days from their screenshots but i think the best part
00:15:30about this is how simple it is to execute those goals and you know you kind of just give it a goal
00:15:36and it's going to go nuts assuming you have some sort of locked in did we win i don't know if we
00:15:43died or not but as i was saying the cool thing about this and about the goals in general is the
00:15:48idea that if you have a clear north star and you have clear criteria for what success looks like
00:15:54you can get a ton out of this and this can kind of just run forever so instead of having to set up
00:15:59your own sort of ralph loop and your own scaffolding or using something outside as an orchestration
00:16:05layer like gsd or superpowers it's kind of just built in for you and like we've done here
00:16:10you can add a lot of neat stuff that is harder to implement but still possible inside of claude code like
00:16:15if we use claude code for this we could have definitely done this we would have just had
00:16:18to implement something like the higgsfield cli or the higgsfield mcp to do all that image generation
00:16:24for us rather than it being this one integrated holistic system so i hope you were able to get
00:16:31something out of this video and i highly suggest you check out codex guys i've really enjoyed the
00:16:35desktop app like i've been talking about before i think this goals thing is really cool and again
00:16:40we could have done this in tandem with claude code as well we could have had the plan be created in
00:16:44claude code and then thrown it into codex for goals had you know claude code take a look at what work it
00:16:49did and kind of have this back and forth which is where i think you get the most value it's kind of
00:16:53like you know the whole being greater than the sum of its parts type deal so as always let me know
00:17:02what you thought make sure to check out chase ai plus there is a link to that down in the pinned
00:17:07comment also running a webinar in a few days there'll be a link there as well so hope to see you there
00:17:12and other than that i'll see you around