codex/codex-rs/tui/src/markdown_text_merge.rs
Felipe Coury 4ca2e436e5 fix(tui): linkify complete bare URLs with tildes (#27088)
## Background

Bare URLs containing `~` in their path are currently only clickable up
to the tilde in the interactive TUI. For example, Codex renders the
visible text for:


`https://www.cs.tufts.edu/~nr/cs257/archive/olin-shivers/dissertation.pdf`

but the OSC 8 destination stops at `https://www.cs.tufts.edu/`. This
makes Cmd-click open the wrong location even though the terminal
recognizes the complete URL outside Codex.

Fixes #26774.

## Root Cause

The URL scanner already accepts `~`. The truncation happens earlier:
with strikethrough parsing enabled, `pulldown-cmark` splits this URL
into adjacent decoded `Event::Text` values around the tilde. The
Markdown renderer annotated each text event independently, so only the
first event still looked like a complete URL with a supported scheme.

The renderer now merges adjacent decoded text events before URL
annotation. It preserves the combined source range while retaining
parser-decoded contents, which avoids regressing entities such as
`&amp;`.
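
The effect of merging can be sketched without the parser. The example below is a minimal, std-only model, not the renderer's code: `Ev`, `looks_like_url`, and `merge_adjacent_text` are hypothetical stand-ins, and the exact fragment boundary (a split immediately before `~`) is an assumption chosen to match the truncated destination described above.

```rust
use std::ops::Range;

/// Hypothetical stand-in for parser events; the real code uses `pulldown_cmark::Event`.
#[derive(Debug, Clone, PartialEq)]
enum Ev {
    Text(String),
    SoftBreak,
}

/// Toy stand-in for the URL scanner: a fragment only qualifies if it starts with a scheme.
fn looks_like_url(s: &str) -> bool {
    s.starts_with("https://") || s.starts_with("http://")
}

/// Merges runs of adjacent `Text` events, concatenating their decoded contents
/// and widening the source range to cover the whole run.
fn merge_adjacent_text(events: Vec<(Ev, Range<usize>)>) -> Vec<(Ev, Range<usize>)> {
    let mut out: Vec<(Ev, Range<usize>)> = Vec::new();
    for (ev, range) in events {
        if let (Ev::Text(next), Some((Ev::Text(prev), prev_range))) = (&ev, out.last_mut()) {
            prev.push_str(next);
            prev_range.end = range.end;
            continue;
        }
        out.push((ev, range));
    }
    out
}

fn main() {
    let url = "https://www.cs.tufts.edu/~nr/cs257/archive/olin-shivers/dissertation.pdf";
    let tilde = url.find('~').unwrap();

    // Assumed parser output: two adjacent text fragments split at the tilde.
    let events = vec![
        (Ev::Text(url[..tilde].to_string()), 0..tilde),
        (Ev::Text(url[tilde..].to_string()), tilde..url.len()),
    ];

    // Annotating each fragment independently: only the first one carries a scheme,
    // so the link destination would stop before the tilde.
    for (ev, _) in &events {
        if let Ev::Text(text) = ev {
            println!("fragment {:?} linkable: {}", text, looks_like_url(text));
        }
    }

    // After merging, a single event holds the complete URL and the combined range.
    let merged = merge_adjacent_text(events);
    assert_eq!(merged, vec![(Ev::Text(url.to_string()), 0..url.len())]);
}
```

Non-text events such as `Ev::SoftBreak` end a run, so only fragments that are directly adjacent in the event stream are joined.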

## Changes

- Add a small iterator that merges adjacent decoded Markdown text events
and their source ranges.
- Apply it at the Markdown renderer boundary before hyperlink detection.
- Add regression coverage for the reported URL in prose, wrapped table
output, and entity-decoded URLs.
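
The entity-decoded case is why the merge concatenates event contents rather than re-slicing the Markdown source by the combined range. A hedged sketch of the difference follows; `merge_decoded` is a hypothetical helper, the `example.com` URL is illustrative, and the three-fragment split around `&amp;` is an assumption about how the parser reports the entity.

```rust
use std::ops::Range;

/// Hypothetical helper: concatenates decoded fragments and widens the source range.
fn merge_decoded(fragments: &[(&str, Range<usize>)]) -> Option<(String, Range<usize>)> {
    let first = fragments.first()?;
    let last = fragments.last()?;
    let text: String = fragments.iter().map(|(s, _)| *s).collect();
    Some((text, first.1.start..last.1.end))
}

fn main() {
    // Markdown source containing an entity inside a bare URL (illustrative only).
    let source = "https://example.com/?a=1&amp;b=2";
    let amp = source.find("&amp;").unwrap();
    let after = amp + "&amp;".len();

    // Assumed parser output: text before the entity, the decoded entity, text after.
    let fragments = [
        (&source[..amp], 0..amp),
        ("&", amp..after),
        (&source[after..], after..source.len()),
    ];

    let (decoded, range) = merge_decoded(&fragments).unwrap();

    // The merged contents keep the decoded `&`.
    assert_eq!(decoded, "https://example.com/?a=1&b=2");
    // Slicing the source by the combined range would bring `&amp;` back.
    assert_eq!(&source[range.clone()], source);
    assert_ne!(decoded, &source[range]);
}
```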

## How to Test

1. Run Codex with `just c`.
2. Ask the assistant to output this exact bare URL with no Markdown link
   syntax:

   `https://www.cs.tufts.edu/~nr/cs257/archive/olin-shivers/dissertation.pdf`
3. Hold Cmd and hover or click the URL.
4. Confirm the complete URL, including the suffix after `~`, is one
destination.
5. Repeat with the URL inside a Markdown table and confirm wrapped
portions retain the same complete destination.

Targeted tests:

- `just test -p codex-tui url_with_tilde`
- `just test -p codex-tui merged_text_events_preserve_entity_decoding`

The full `codex-tui` test run was also executed. Its only failures were
the two existing Guardian feature-flag tests:

- `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
- `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`

2026-06-08 17:02:36 -07:00


//! Markdown text-event merging that preserves parser-decoded contents and source offsets.

use std::iter::Peekable;
use std::ops::Range;

use pulldown_cmark::Event;

/// Merges adjacent parsed text events without reconstructing them from the Markdown source.
///
/// Markdown extensions can split visually contiguous text around delimiter characters. Keeping the
/// decoded event contents together lets downstream consumers recognize tokens that span those
/// parser boundaries while the combined source range remains available for offset-aware rendering.
pub(crate) struct DecodedTextMerge<I: Iterator> {
    iter: Peekable<I>,
}

impl<I: Iterator> DecodedTextMerge<I> {
    pub(crate) fn new(iter: I) -> Self {
        Self {
            iter: iter.peekable(),
        }
    }
}

impl<'a, I> Iterator for DecodedTextMerge<I>
where
    I: Iterator<Item = (Event<'a>, Range<usize>)>,
{
    type Item = (Event<'a>, Range<usize>);

    fn next(&mut self) -> Option<Self::Item> {
        let (event, mut range) = self.iter.next()?;
        let Event::Text(text) = event else {
            return Some((event, range));
        };
        if !matches!(self.iter.peek(), Some((Event::Text(_), _))) {
            return Some((Event::Text(text), range));
        }

        let mut merged = text.into_string();
        while matches!(self.iter.peek(), Some((Event::Text(_), _))) {
            let Some((Event::Text(text), next_range)) = self.iter.next() else {
                break;
            };
            merged.push_str(&text);
            range.end = next_range.end;
        }
        Some((Event::Text(merged.into()), range))
    }
}