jasonjmcghee 8 hours ago

I've had a pretty poor experience with Gemini.

I've had to convince it to do things it should just be able to do but thinks it can't for some reason. Like reading from a file outside of the project directory: it can do it fine, but refuses to unless you convince it that, no, it actually can.

It has also inserted a literal "\n" instead of newlines on a number of occasions.

I'd argue these behaviors are much more important than being able to use interactive commands.
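
A naive way to catch the literal "\n" problem after the fact, as a minimal sketch: if a generated file contains the two-character sequence backslash-n but not a single real line break, it was almost certainly escaped by mistake. The function name `unescape_newlines` is made up for illustration, and the heuristic is deliberately conservative, since a file with real line breaks may legitimately contain "\n" inside string literals.

```python
def unescape_newlines(text: str) -> str:
    """Turn literal backslash-n sequences into real newlines.

    Heuristic (an assumption, not a general fix): only rewrite when the
    text has no real newline at all, so files that merely contain "\\n"
    inside string literals are left untouched.
    """
    has_real_newline = "\n" in text
    has_escaped_newline = "\\n" in text
    if has_real_newline or not has_escaped_newline:
        return text
    return text.replace("\\n", "\n")


if __name__ == "__main__":
    broken = "line one\\nline two\\nline three"
    print(unescape_newlines(broken))
```

Anything smarter than this would need to understand the language's string syntax, which is why it's better when the model gets it right in the first place.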

  • sunaookami 7 hours ago

    Gemini doesn't seem to be trained on tool use (which Claude is), so it quite often thinks it can't do something it certainly can and does a lot of nonsense. For me it fails nearly every time it tries to read project files because it uses relative paths instead of absolute ones, so I've put "For your "ReadFile" and "WriteFile" tool, you MUST use absolute paths to files" in my system instructions.

    Speaking of system instructions, Gemini always forgets them or doesn't follow them. And it still puts code comments nearly everywhere, it drives me nuts.

    Codex is much better at following system instructions but the CLI is..... very bad.

    • KronisLV 4 hours ago

      My experience with Gemini 2.5 Pro has oddly been better, maybe because I use RooCode/Cline? It was oddly apologetic, though, wasting tokens on lamenting its failure when it fails to do something and whatnot, instead of just getting on with the solution.

      At the same time, even the big versions of Qwen3 Coder (480B) regularly mess up file paths and use the wrong path separators, leading to files like srccomponentsMyComponent.vue being created instead of src/components/MyComponent.vue.

      > And it still puts code comments nearly everywhere, it drives me nuts.

      I’ve had the issue of various models sometimes inserting comments like “// removed Foo”, when it makes no sense to flag the absence of code that is no longer there.

      At the same time, sometimes the LLMs love to eat my comments when doing changes and leave behind only the code.

      How silly (and annoying). It’s good to be able to try out multiple models with the exact same prompts though, maybe I should create my own custom mode for RooCode with all of the important stuff I want baked in.

    • theshrike79 6 hours ago

      Codex doesn’t give feedback while it’s running. It just works quietly, so you can’t see it going off the rails, and it’s not easy to interrupt.

      Claude is better at this.

    • lxgr 6 hours ago

      Gemini seems to have a poor model of both what it can and what it is allowed to do.

      I’ve noticed the latter with several image generation refusals that I could eventually talk it out of quite easily (usually by mentioning fair use in a copyright/trademark context).

  • kaycey2022 2 hours ago

    Gemini CLI is definitely a much worse client than some of the other agent clients like opencode, cursor etc. But from my experience, that isn't because of the model quality. I get better quality responses from the gemini web chat interface than chatgpt, claude etc.

    Of course my experience is anecdotal, but we hardly have any decent benchmarks to compare these models. I suspect most benchmarks have leaked into training sets, rendering them useless anyway.

    • OJFord 2 hours ago

      Also people don't talk enough about (or are bad at separating) the model vs. the client tool - e.g. from your comment maybe using codex/Claude Code/aider with the Gemini API would be better, best even, but people rarely make that comparison or separation, it's always 'Claude Code with Claude vs. codex with GPT-x' etc.

  • ChrisGreenHeur 7 hours ago

    I have had these exact issues a lot with codex (gpt-5-codex)

  • Liebmann5 8 hours ago

    I second this man’s take. I’ve been using it consistently for a few months to give it a try and it is definitely subpar. It can give really good answers at times, but it isn’t worth the time, energy, or luck to get it there.

orliesaurus 8 hours ago

As someone who has been experimenting with AI‑powered command‑line helpers, I think adding interactive commands to the Gemini CLI is a logical step, but it won’t be useful unless the underlying model is reliable for basic tasks. Several people here noted that Gemini sometimes refuses to read files outside the project directory or mishandles newlines; those sorts of inconsistencies undermine trust.

In a world where you have 100 options, trust is of utmost importance. The CLI’s integration with node‑pty and the ability to stream pseudo‑tty output into mini‑terminal viewports is clever, and I’d love to see that layer documented or open‑sourced so other tools can build on it. I see this feature as something you’d use for short‑lived tasks like running a quick script, checking a log, or doing a one‑off database query. For longer editing sessions I’d still use a real terminal multiplexer and editor. If Google can fix the reliability issues and make the API for interactive sessions open, that would be hella good for everyone!

amitav1 9 hours ago

I think that this feature might have taken Gemini CLI from just Temu Claude Code with higher usage limits, to actually competitive as a tool. It'll be interesting to see how well this actually works in practice.

  • adastra22 7 hours ago

    Idk, the more skilled I get with Claude Code, the less I use interactive workflows.

    • anp 7 hours ago

      I tend to agree but there are a few scenarios where I really want it to work. Debuggers in particular seem hard to get right for the current agents. I’ve not been able to get the various MCP servers I’ve tried to work, and I’ve struck out using the debug adapter protocol from agent-authored Python. The best results I’ve gotten are from prompting it to run the debugger under screen, but it takes many tool calls to iterate IME. I’m curious to see how gemini cli works for that use case with this feature.

      • rockwotj 6 hours ago

        I would love to use gdb through an agent instead of directly. I spend so much time looking up commands and I sometimes skip things because I get impatient stepping over the next thing

heeen2 3 hours ago

I made an MCP that would use a pty lib to allow Claude to debug a TUI app I was writing, with ok-ish results. Ultimately I wanted to see what was happening myself, so when I need interactive I just tell it to use tmux-cli to capture the neighboring pane. https://github.com/pchalasani/claude-code-tools/blob/main/do... Maybe turning that into an MCP with more guardrails and an integrated guide for the agent would make it more powerful.

  • michaelbuckbee 2 hours ago

    I'm not actually sure about that (that turning it into an MCP would help). I've seen more momentum building around having better cli tool integration with ClaudeCode than MCP reliance.

caymanjim 7 hours ago

The best thing about this is that now Claude and Codex have to add it.

  • theshrike79 6 hours ago

    I’m still waiting for Gemini to add hooks and sub-agents

dustypotato 4 hours ago

> It's not just a stream of text; it's a live feed.

LLM wrote this article it seems.

For me Gemini CLI is not as good as Claude Code: it sometimes writes more code than necessary, which makes it hard to maintain. But I hope it gets there with the Gemini 3.0 release. It's open source, so I can imagine it getting there faster with community contributions.

nacs 9 hours ago

It's nice that they mention node-pty that does most of the heavy lifting for the terminal/pseudo-tty that powers this (VSCode's terminal emulator is powered by the same library).

It looks like they've added a layer on top of node-pty to allow serializing/streaming of the contents to the terminal within the mini-terminal viewports they're allocating for the terminal rendering. I wonder if they're releasing that portion as open source?

snthpy 5 hours ago

I couldn't tell from the post how this will affect Gemini's ability to assist better as a result.

I guess for Google this will be a treasure trove of real developer interactions to train on.

I might try this once Gemini 3 comes out. Until then, if you're running tmux or zellij, this seems like a worse user experience since you're in a subwindow and have less screen real estate to work with.

jgoodhcg 9 hours ago

Popping into nvim to check on something really quick seems immediately useful. I think I'll still want a dedicated tab or different terminal app to have my longer lived editor open but this might be nice for validating output with test runners or checking on a database entry in psql or something.

  • nacs 9 hours ago

    I'm not sure how usable neovim will be in what looks to be a 6 line high window as they show in the demo video.

  • mark_l_watson 8 hours ago

    I had problems running Emacs in no window mode (emacs -nw). I should try again, or maybe just use vim.

arjie 5 hours ago

It was very buggy for me. You kind of have to coax it into interactive use and then some of the time it got stuck pondering once I exited the app flow and returned to the Gemini CLI (not with Ctrl-F, full exit, it closes the TUI window). It's also super laggy.

To be honest, at this point having Claude Code monitor the output of a `tmux pipe-pane` is probably going to be superior.

efskap 9 hours ago

To be clear, the LLM is only aware of the final state of the pty when the command exits, right? It's not a TUI computer-use model at this point from what I can tell.

lazarie 2 hours ago

Trying to use Gemini CLI is one of the most frustrating experiences with any tool I've had in over two decades of working with software.

It's seemingly very hard to understand how it should be configured at all if you don't have a personal Google account. Rather than just using your credentials to log in and start, you need to find some forum posts by people who have reverse-engineered that you need to use a Google Cloud environment variable, even if you are operating without a "Code Assist License" on a Google Business account.

No matter what I do on my paid subscription through Google Business, with a Google Cloud project configured in the environment (which I had to set up explicitly just to test the CLI, even though I have access to the models through my subscription and AI Studio), I always get error 429 after one to five messages. In my case the actual limits seem to be just a fraction of what Google claims for Gemini. There is no clearly stated reason why, neither in the cloud console nor in the tool itself, apart from the HTTP error message.

These are not big prompts or anything of that nature. It's simple things like review a readme file or double check a single file for errors. It's been like this from the very beginning.

Even now, just to verify it (I haven't used Gemini for over a week), I asked it to review 3 files that are in git diff, each between 50 and 100 lines long. After checking the first file it's already on 429, on a PAID subscription, and it even states "99%" context left. So my paid subscription lets me use less than 1% of the context window and I get locked out for an unknown amount of time.

Contrast this with both Codex and Claude Code, where you just log in and go: it's really a night and day difference. The user experience of the paid version of Gemini CLI is just utterly terrible.

selvan 6 hours ago

From the blog " Gemini CLI spawns a new process within a pseudo-terminal in the background, leveraging the node-pty library...So how does this virtual terminal running in the background show up on your screen? Think of it like a video stream. Our new serializer takes a snapshot of the pseudo terminal at every moment—capturing every piece of text, every color, and even the cursor's position. These snapshots are then streamed to you, allowing you to see and interact with the terminal application in real-time. It's not just a stream of text; it's a live feed."

Terminal serializer code: https://github.com/google-gemini/gemini-cli/blob/main/packag...

Uses @xterm/headless npm package.
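
For anyone unfamiliar with the pseudo-terminal part, here is a rough Python sketch of the same idea using the standard library's `pty` module instead of node-pty. It is not Gemini CLI's code: `run_in_pty` is a hypothetical helper, it only works on Unix-like systems, and it returns the raw output once the process exits rather than streaming serialized screen snapshots.

```python
import os
import pty
import subprocess
import sys


def run_in_pty(argv):
    """Run argv attached to a pseudo-terminal; return what it wrote as text."""
    master_fd, slave_fd = pty.openpty()
    # The child gets the slave end as stdin/stdout/stderr, so it sees a tty.
    proc = subprocess.Popen(
        argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd, close_fds=True
    )
    # The parent must close its copy of the slave end, otherwise the read
    # loop below never sees end-of-output.
    os.close(slave_fd)

    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, 4096)
            except OSError:
                # Linux reports EIO on the master once the child side is closed.
                break
            if not data:
                # Other platforms signal the same condition with an empty read.
                break
            chunks.append(data)
    finally:
        os.close(master_fd)
        proc.wait()
    return b"".join(chunks).decode(errors="replace")


if __name__ == "__main__":
    # The child reports whether its stdout looks like a terminal.
    print(run_in_pty([sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]))
```

Because the child is attached to a pty, it believes it is talking to a real terminal, which is what lets full-screen programs run at all. Turning the raw bytes into a screen snapshot with colors and cursor position is the job of a terminal emulator such as the @xterm/headless package mentioned above.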

  • ffsm8 6 hours ago

    Your link 404s

    • selvan 6 hours ago

      Thanks. Fixed it.

thallavajhula 5 hours ago

Aside: The demo shows git commands being run in the CLI. I absolutely hate it when devs use a commit message that says "chore: my first commit from gemini cli" - I get that it's meant for the demo, but in general too, I've seen codebases that enforce these commit prefixes such as "chore", "feat", "bugfix" etc. Is there any real value to that? Besides using up the 50 character limit on the first line of the commit message, I don't see those prefixes accomplishing anything. Also, non-imperative commit messages?! Come on, guys!
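
For concreteness, this is roughly what tooling does with such prefixes, as a minimal sketch. The regex follows the `type(scope)!: description` shape from the Conventional Commits spec; `parse_subject` and `SUBJECT_LIMIT` are names invented here, and the 50-character figure is the common guideline mentioned above, not something git enforces.

```python
import re

# Common guideline for the first line of a commit message (an assumption
# of this sketch, not a rule git itself applies).
SUBJECT_LIMIT = 50

# type, optional (scope), optional "!" for breaking changes, then ": description"
_SUBJECT_RE = re.compile(
    r"^(?P<type>[A-Za-z]+)"
    r"(?:\((?P<scope>[^)]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>.+)$"
)


def parse_subject(subject: str):
    """Split a commit subject into its prefix parts, or return None."""
    match = _SUBJECT_RE.match(subject)
    if match is None:
        return None
    return {
        "type": match["type"].lower(),
        "scope": match["scope"],
        "breaking": match["breaking"] is not None,
        "description": match["description"],
        "within_limit": len(subject) <= SUBJECT_LIMIT,
    }


if __name__ == "__main__":
    print(parse_subject("chore: my first commit from gemini cli"))
    print(parse_subject("Fix thumbnail not updating after upload"))
```

A changelog generator or a "show me only the fixes" filter is then a loop over `git log` subjects that keeps the entries with the wanted `type`.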

  • dustypotato 4 hours ago

    If you're looking in the commit tree for which commit fixed a certain bug but didn't fix it fully, for example, you first look at all the `fix:` commits and then, if one matches, you read the rest. You just write `fix: Thumbnail wasn't updating after upload` instead of `Fix for Thumbnail not updating after upload`, which isn't really wasting characters.

    But I'm also not a fan of this being an enforced convention because somebody higher up decided it brings some value, and now it's the 101st convention a new dev has to follow, which actually reduces productivity.

  • ilikepi 5 hours ago

    > I've seen codebases that enforce these commit prefixes such as "chore", "feat", "bugfix" etc. Is there any real value to that?

    It's a choice some teams make, presumably because _they_ see value in it (or at least think they will). The team I'm on has particular practices which I'm sure would not work on other teams, and might cause you to look at them with the same incredulity, but they work for us.

    For what it's worth, the prefixes you use as examples do arise from a convention with an actual spec:

    https://www.conventionalcommits.org/en/v1.0.0/

    • jval43 5 hours ago

      Just because someone put up a fancy website and named it "conventional" doesn't mean it's a convention or that it's a good idea.

      The main reason this exists is because Angular was doing it to generate their changelogs from it. Which makes sense, but outside of that context it doesn't feel fully baked.

      I usually see junior devs make such commits, but at the same time they leave the actual commit message body completely empty and don't even include a reference to the ticket they're working on.

baalimago 6 hours ago

How many tokens does it eat up? Does the context stay concise? Who owns these "serializations" that are uploaded to Google all the time?

coderatlarge 7 hours ago

i’ve had little luck getting ai systems to correctly set up networking for a set of vms. they tend to go round and round with iptables commands that don’t ultimately solve the problem. is config fundamentally harder than writing code?

  • theshrike79 6 hours ago

    Did you give them a way to check the networking rules?

    If not, the model is just shooting in the dark and guessing.

jeffrallen 4 hours ago

Gemini is comically bad, like so bad you wonder if the product managers even know what it is supposed to sound/look like when working with an LLM.

What the heck is going on in Google-land?