Charlie Marsh 7da4af622f Optimize unbounded byte scans with memchr (#26265)
## Summary

This PR adds `memchr` for some low-hanging performance improvements
(namely, in MCP stdio, Ollama streaming, and full message-history
newline counts).

Codex produced the following release benchmarks:

| Operation | Before | After | Speedup |
| --- | ---: | ---: | ---: |
| MCP 1 MiB chunked line | 2.172 s | 3.984 ms | 545x |
| Ollama 1 MiB chunked line | 1.673 s | 2.790 ms | 600x |
| Count newlines in 10 MiB history | 132.83 ms | 20.05 ms | 6.6x |

With a "real" MCP setup (`ExecutorStdioServerLauncher` started a Python
MCP server, completed `initialize`, requested `tools/list`, and
deserialized a 1 MiB tool description over newline-delimited stdio),
it's about 16x faster end-to-end:

| Branch | 50 calls | Per call |
| --- | ---: | ---: |
| `main` | 862.53 ms | 17.25 ms |
| this branch | 53.89 ms | 1.08 ms |

`memchr` is already in our dependency tree and extremely widely used for
this kind of optimized scanning.
2026-06-04 09:53:08 -04:00

33 lines
861 B
Rust

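use bytes::BytesMut;
use memchr::memchr;

/// Accumulates bytes and yields complete newline-terminated lines.
#[derive(Default)]
#[cfg_attr(test, derive(Debug, PartialEq, Eq))]
pub(crate) struct LineBuffer {
    bytes: BytesMut,
    /// Prefix already scanned and known not to contain a newline.
    scanned_len: usize,
}

impl LineBuffer {
    pub(crate) fn extend_from_slice(&mut self, bytes: &[u8]) {
        self.bytes.extend_from_slice(bytes);
    }

    /// Splits off and returns the next line, including its trailing `\n`,
    /// or `None` if no complete line is buffered yet.
    pub(crate) fn take_line(&mut self) -> Option<BytesMut> {
        // Scan only the bytes that arrived since the last unsuccessful scan.
        let Some(relative_index) = memchr(b'\n', &self.bytes[self.scanned_len..]) else {
            // No newline yet: remember how far we looked so the next call skips it.
            self.scanned_len = self.bytes.len();
            return None;
        };
        let newline_index = self.scanned_len + relative_index;
        let line = self.bytes.split_to(newline_index + 1);
        // The remaining bytes have not been scanned.
        self.scanned_len = 0;
        Some(line)
    }
}

#[cfg(test)]
#[path = "line_buffer_tests.rs"]
mod tests;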
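
A minimal usage sketch, not taken from the PR or from `line_buffer_tests.rs`: it shows how a caller might feed chunks into a buffer like the one above and drain complete lines, and how `scanned_len` keeps a long line split across many chunks from being rescanned from the start each time. To stay self-contained it uses a std-only stand-in: the names `SketchLineBuffer` and `collect_lines` are hypothetical, `Vec<u8>` replaces `BytesMut`, and `Iterator::position` replaces `memchr`.

```rust
/// Std-only stand-in for `LineBuffer` (hypothetical, for illustration only).
#[derive(Default)]
struct SketchLineBuffer {
    bytes: Vec<u8>,
    /// Prefix already scanned and known not to contain a newline.
    scanned_len: usize,
}

impl SketchLineBuffer {
    fn extend_from_slice(&mut self, bytes: &[u8]) {
        self.bytes.extend_from_slice(bytes);
    }

    /// Returns the next complete line (including its trailing `\n`), if any.
    fn take_line(&mut self) -> Option<Vec<u8>> {
        // Only look at bytes that arrived since the last unsuccessful scan.
        let Some(relative_index) = self.bytes[self.scanned_len..]
            .iter()
            .position(|&b| b == b'\n')
        else {
            self.scanned_len = self.bytes.len();
            return None;
        };
        let newline_index = self.scanned_len + relative_index;
        // Keep everything after the newline in the buffer for the next call.
        let rest = self.bytes.split_off(newline_index + 1);
        let line = std::mem::replace(&mut self.bytes, rest);
        self.scanned_len = 0;
        Some(line)
    }
}

/// Feeds `chunks` one at a time and collects every complete line.
fn collect_lines(chunks: &[&[u8]]) -> Vec<Vec<u8>> {
    let mut buffer = SketchLineBuffer::default();
    let mut lines = Vec::new();
    for chunk in chunks {
        buffer.extend_from_slice(chunk);
        // A single chunk may complete zero, one, or several lines.
        while let Some(line) = buffer.take_line() {
            lines.push(line);
        }
    }
    lines
}

fn main() {
    // Two lines arriving split across three chunks.
    let chunks: [&[u8]; 3] = [b"hel", b"lo\nwor", b"ld\n"];
    let lines = collect_lines(&chunks);
    for line in &lines {
        print!("{}", String::from_utf8_lossy(line));
    }
}
```

Draining with `while let` after every `extend_from_slice` matters because one chunk can contain several newlines. Resetting `scanned_len` to 0 after a split is what keeps the "known not to contain a newline" invariant true for the leftover bytes.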