I'd put most of the entity actions into a few functions:
define move(direction dir) { if(dir == LEFT) then x--; ... }
define jump(velocity v) { ... }
The same goes for GUI functions:
define promptTextBox(String message, int scrollSpeed) { ... }
define promptTextBoxQuestion(String message, String[] questions, int scrollSpeed) { ... }
Then you need some way of executing those functions. An easier way (in the long run) would be to create your own small scripting format:
[Somewhere in cutscene1.txt]:
Player: Move LEFT > do 3 times
Player: Jump
Wait for Value < Player: GetIsStanding
GUI: Prompt "You're the chosen one!" 20
You'd have to register these targets and commands in your code, like so:
ScriptHandler.registerCallback("Player", playerObject);
ScriptHandler.getCallback("Player").setCommand("Move", 1, Player.move); // where 1 is the number of parameters.
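To make the registration idea concrete, here is a minimal Ruby sketch of a command registry with a one-line script runner. Everything in it is an assumption for illustration: the ScriptHandler class, its register and run_line methods, and the Player class are hypothetical, and it only handles plain "Target: Command args" lines, not the "> do 3 times" or "Wait for" forms from the script above.

```ruby
# Hypothetical sketch: a tiny command registry plus a line-based script runner.
class ScriptHandler
  Command = Struct.new(:arity, :callable)

  def initialize
    @targets = Hash.new { |hash, name| hash[name] = {} }
  end

  # Register a command name for a target, with its expected parameter count.
  def register(target, command, arity, callable)
    @targets[target][command] = Command.new(arity, callable)
  end

  # Parse and run one line such as 'Player: Move LEFT'.
  def run_line(line)
    target, rest = line.split(":", 2).map(&:strip)
    # Split on whitespace, but keep double-quoted strings together.
    name, *args = rest.scan(/"[^"]*"|\S+/).map { |token| token.delete('"') }
    command = @targets.fetch(target).fetch(name)
    unless args.size == command.arity
      raise ArgumentError, "#{target}: #{name} expects #{command.arity} argument(s), got #{args.size}"
    end
    command.callable.call(*args)
  end
end

# Hypothetical game object with a single scriptable action.
class Player
  attr_reader :x

  def initialize
    @x = 0
  end

  def move(direction)
    @x += direction == "LEFT" ? -1 : 1
  end
end

player = Player.new
handler = ScriptHandler.new
handler.register("Player", "Move", 1, player.method(:move))
3.times { handler.run_line("Player: Move LEFT") }
puts player.x # prints -3
```

Note that every argument reaches the callback as a string (the 20 in the GUI prompt line would arrive as "20"), so each command would have to convert its own parameters.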
Creating your own scripting language can get complicated. It might be easier to embed an existing scripting language such as Lua.
It really depends on how complex you want things to be.
For my cutscene in Zed and Ginger, I mostly used programmatic movement of the normal game objects, but for the movement of Zed as an energy cloud I played all that movement back and recorded the mouse position each frame. Doing it that way made synchronising the complex movements significantly simpler.
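The record-and-replay idea can be sketched roughly as follows. This is not the code from Zed and Ginger; PathRecorder and its methods are hypothetical names, and reading the actual mouse position each frame is left to whatever game library you use.

```ruby
# Hypothetical recorder: stores one [x, y] sample per frame, then replays them.
class PathRecorder
  def initialize
    @samples = []
  end

  # Call once per frame while recording, passing the current mouse position.
  def record(x, y)
    @samples << [x, y]
  end

  # Position for a given frame during playback. Frames outside the
  # recording are clamped to the first or last sample.
  def position_at(frame)
    return nil if @samples.empty?

    @samples[frame.clamp(0, @samples.size - 1)]
  end
end

recorder = PathRecorder.new
[[0, 0], [5, 2], [9, 6]].each { |x, y| recorder.record(x, y) }
x, y = recorder.position_at(1) # the sample recorded on the second frame
```

Because the samples are indexed by frame number, the replayed path stays in step with anything else that is driven by the same frame counter.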
Since you are using Ruby (or, at least, you have used Ruby in stuff before), you might keep all the movement info in a YAML file, which you could read in and convert into actual movements of each sprite (you just need move_x, move_y, angle, object_id, etc. in the file as a big array of arrays).
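A minimal sketch of that YAML approach, under some assumptions: the file is a list of frames, each frame is a list of [object_id, move_x, move_y, angle] rows, move_x and move_y are per-frame offsets, and angle is absolute. The Sprite struct and apply_frame helper are hypothetical, and the YAML is inlined here so the example runs on its own (in a game you would read it with YAML.load_file instead).

```ruby
require "yaml"

# Hypothetical stand-in for a real sprite class.
Sprite = Struct.new(:x, :y, :angle)

# Assumed layout: one entry per frame, each a list of
# [object_id, move_x, move_y, angle] rows.
CUTSCENE_YAML = <<~YAML
  - - [zed, 2, 0, 0]
    - [ginger, -1, 0, 90]
  - - [zed, 2, 1, 15]
YAML

# Apply one frame of movement rows to the matching sprites.
def apply_frame(sprites, frame)
  frame.each do |object_id, move_x, move_y, angle|
    sprite = sprites.fetch(object_id)
    sprite.x += move_x
    sprite.y += move_y
    sprite.angle = angle
  end
end

sprites = { "zed" => Sprite.new(0, 0, 0), "ginger" => Sprite.new(10, 0, 0) }
frames = YAML.safe_load(CUTSCENE_YAML)
frames.each { |frame| apply_frame(sprites, frame) }
```

In a real game you would call apply_frame once per update tick rather than looping over all the frames at once.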
Beyond that, you'd really need to know what you wanted to achieve before more advice could be forthcoming :)