mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
4ca2e436e5
## Background

Bare URLs containing `~` in their path are currently only clickable up to the tilde in the interactive TUI. For example, Codex renders the visible text for:

`https://www.cs.tufts.edu/~nr/cs257/archive/olin-shivers/dissertation.pdf`

but the OSC 8 destination stops at `https://www.cs.tufts.edu/`. This makes Cmd-click open the wrong location even though the terminal recognizes the complete URL outside Codex.

Fixes #26774.

## Root Cause

The URL scanner already accepts `~`. The truncation happens earlier: with strikethrough parsing enabled, `pulldown-cmark` splits this URL into adjacent decoded `Event::Text` values around the tilde. The Markdown renderer annotated each text event independently, so only the first event still looked like a complete URL with a supported scheme.

The renderer now merges adjacent decoded text events before URL annotation. It preserves the combined source range while retaining parser-decoded contents, which avoids regressing entities such as `&amp;`.

## Changes

- Add a small iterator that merges adjacent decoded Markdown text events and their source ranges.
- Apply it at the Markdown renderer boundary before hyperlink detection.
- Add regression coverage for the reported URL in prose, wrapped table output, and entity-decoded URLs.

## How to Test

1. Run Codex with `just c`.
2. Ask the assistant to output this exact bare URL with no Markdown link syntax: `https://www.cs.tufts.edu/~nr/cs257/archive/olin-shivers/dissertation.pdf`
3. Hold Cmd and hover or click the URL.
4. Confirm the complete URL, including the suffix after `~`, is one destination.
5. Repeat with the URL inside a Markdown table and confirm wrapped portions retain the same complete destination.

Targeted tests:

- `just test -p codex-tui url_with_tilde`
- `just test -p codex-tui merged_text_events_preserve_entity_decoding`

The full `codex-tui` test run was also executed. Its only failures were the two existing Guardian feature-flag tests:

- `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
- `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`
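
The root cause described above can be modelled without the parser. The sketch below is illustrative only: `looks_like_url`, `annotate_each`, `annotate_merged`, and `split_example` are hypothetical names, the scheme check is a simplified stand-in for the real URL scanner, and the three-way split around `~` is an assumption about how the parser divides the text.

```rust
/// Hypothetical stand-in for the renderer's URL check: a fragment only
/// qualifies when it starts with a supported scheme.
fn looks_like_url(fragment: &str) -> bool {
    fragment.starts_with("https://") || fragment.starts_with("http://")
}

/// Annotates each fragment on its own, modelling the behavior before the fix.
/// Returns the fragments that would receive a hyperlink destination.
fn annotate_each<'a>(fragments: &[&'a str]) -> Vec<&'a str> {
    fragments
        .iter()
        .copied()
        .filter(|fragment| looks_like_url(fragment))
        .collect()
}

/// Joins adjacent fragments before checking, modelling the behavior after the fix.
fn annotate_merged(fragments: &[&str]) -> Vec<String> {
    let merged = fragments.concat();
    if looks_like_url(&merged) {
        vec![merged]
    } else {
        Vec::new()
    }
}

/// Assumed split of the reported URL; the real parser boundaries may differ.
fn split_example() -> [&'static str; 3] {
    [
        "https://www.cs.tufts.edu/",
        "~",
        "nr/cs257/archive/olin-shivers/dissertation.pdf",
    ]
}
```

With this model, `annotate_each(&split_example())` yields only the prefix that ends before the tilde, while `annotate_merged(&split_example())` yields the complete URL as a single destination.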
51 lines
1.6 KiB
Rust
//! Markdown text-event merging that preserves parser-decoded contents and source offsets.

use std::iter::Peekable;
use std::ops::Range;

use pulldown_cmark::Event;

/// Merges adjacent parsed text events without reconstructing them from the Markdown source.
///
/// Markdown extensions can split visually contiguous text around delimiter characters. Keeping the
/// decoded event contents together lets downstream consumers recognize tokens that span those
/// parser boundaries while the combined source range remains available for offset-aware rendering.
pub(crate) struct DecodedTextMerge<I: Iterator> {
    iter: Peekable<I>,
}

impl<I: Iterator> DecodedTextMerge<I> {
    pub(crate) fn new(iter: I) -> Self {
        Self {
            iter: iter.peekable(),
        }
    }
}

impl<'a, I> Iterator for DecodedTextMerge<I>
where
    I: Iterator<Item = (Event<'a>, Range<usize>)>,
{
    type Item = (Event<'a>, Range<usize>);

    fn next(&mut self) -> Option<Self::Item> {
        let (event, mut range) = self.iter.next()?;
        let Event::Text(text) = event else {
            return Some((event, range));
        };
        if !matches!(self.iter.peek(), Some((Event::Text(_), _))) {
            return Some((Event::Text(text), range));
        }

        let mut merged = text.into_string();
        while matches!(self.iter.peek(), Some((Event::Text(_), _))) {
            let Some((Event::Text(text), next_range)) = self.iter.next() else {
                break;
            };
            merged.push_str(&text);
            range.end = next_range.end;
        }
        Some((Event::Text(merged.into()), range))
    }
}
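
To see why the merge keeps decoded contents and source ranges separate, here is a minimal, dependency-free sketch of the same idea. It is not the crate's code: the `Event` enum is a hypothetical two-variant stand-in for `pulldown_cmark::Event`, `merge_text` and `entity_example` are illustrative names, and the event boundaries and offsets in the sample are assumed rather than taken from real parser output.

```rust
use std::ops::Range;

/// Hypothetical stand-in for `pulldown_cmark::Event`, reduced to two variants.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Text(String),
    SoftBreak,
}

/// Merges runs of adjacent `Text` events, concatenating their decoded contents
/// and extending the source range to the end of the last merged event.
fn merge_text(events: Vec<(Event, Range<usize>)>) -> Vec<(Event, Range<usize>)> {
    let mut out: Vec<(Event, Range<usize>)> = Vec::new();
    for (event, range) in events {
        if let Event::Text(next) = &event {
            // Extend the previous output entry when it is also text.
            if let Some((Event::Text(prev), prev_range)) = out.last_mut() {
                prev.push_str(next);
                prev_range.end = range.end;
                continue;
            }
        }
        out.push((event, range));
    }
    out
}

/// Assumed events for the source `https://example.com/?a=1&amp;b=2` followed
/// by a line break: the entity occupies five source bytes but decodes to one.
fn entity_example() -> Vec<(Event, Range<usize>)> {
    vec![
        (Event::Text("https://example.com/?a=1".to_string()), 0..24),
        (Event::Text("&".to_string()), 24..29), // source text: `&amp;`
        (Event::Text("b=2".to_string()), 29..32),
        (Event::SoftBreak, 32..33),
    ]
}
```

Calling `merge_text(entity_example())` produces one text event holding the decoded URL `https://example.com/?a=1&b=2` with the range `0..32`, followed by the untouched `SoftBreak`. The merged string is 28 bytes long while its range covers 32 source bytes, which is why slicing the Markdown source by range would reintroduce the undecoded entity.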