
Bytecode parity - exception#7557

Merged
youknowone merged 3 commits into RustPython:main from youknowone:bytecode-parity-exception
Apr 10, 2026

Conversation

@youknowone
Member

@youknowone youknowone commented Apr 3, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Corrected exception rethrow/unwind behavior across try/except/finally and exception-group paths.
    • Fixed exception stack handling during frame resumption and generator/coroutine resumption.
  • Performance

    • New bytecode optimization passes: tuple-unpack simplification, dead-store elimination, and swap removal.
    • Added rewrites and peephole adjustments to reduce redundant ops and nops; added checks for uninitialized loads.
  • Tooling

    • Added bytecode dump and comparison CLI tools for cross-interpreter validation.
  • Refactor

    • Simplified frame/exception-context APIs and internal exception-slot model.

@coderabbitai
Contributor

coderabbitai bot commented Apr 3, 2026

📝 Walkthrough

Walkthrough

Refactors exception handling across compiler and VM, moves reraise into cleanup scopes, adds IR optimization passes (tuple unpack/swap/dead-store elimination and uninitialized-load checks), removes explicit exception-passing frame API, and adds two bytecode-dump/compare scripts for CPython vs RustPython.

Changes

Cohort / File(s) | Summary
Exception Control Flow Compilation
crates/codegen/src/compile.rs
Rewrote try/finally and except* emission: emit Reraise { depth: 0 } inside cleanup coverage, reorganize handler/cleanup/continuation blocks, preserve stack values for outer unwinds, adjust PopBlock placements, and relax super() optimization constraints to support EX_CALL forms (emit attribute-load + PushNull when needed).
IR-Level Optimization Passes
crates/codegen/src/ir.rs
Added passes: optimize_build_tuple_unpack, eliminate_dead_stores, apply_static_swaps, remove_redundant_nops_and_jumps, and add_checks_for_loads_of_uninitialized_variables; adjusted peephole superinstruction formation and cold-block location handling; added helpers to iteratively clean NOPs/jumps.
VM Exception/Frame Runtime
crates/vm/src/frame.rs, crates/vm/src/vm/mod.rs
Instruction::Reraise no longer uses depth to pop preserved values; changed frame exception model: removed with_frame_exc, snapshot/restore caller exception slot, reworked ExceptionStack default and resume_gen_frame to use push/pop semantics.
Bytecode Tooling
scripts/dis_dump.py, scripts/compare_bytecode.py
Added dis_dump.py to compile .py files to normalized JSON bytecode dumps and compare_bytecode.py to run both interpreters, collect dumps, compare instruction streams, and emit summary/detail reports (with worker parallelism, sampling, and structured output).
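The tooling row above can be pictured with a minimal sketch of the normalize-and-compare idea. The helper names below are hypothetical; the real dis_dump.py emits a richer JSON dump and compare_bytecode.py adds worker parallelism and reporting.

```python
import dis

def dump_code(source):
    """Compile source and return a normalized (opname, argrepr) stream.

    Hypothetical helper: only the top-level code object is dumped here,
    whereas the real script recurses into nested code objects.
    """
    code = compile(source, "<dump>", "exec")
    return [(ins.opname, ins.argrepr) for ins in dis.get_instructions(code)]

def compare_dumps(a, b):
    """Return the indices at which two instruction streams diverge."""
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(a) != len(b):
        diffs.append(min(len(a), len(b)))
    return diffs
```

Running `dump_code` on the same source under CPython and RustPython and diffing the streams is, in essence, what compare_bytecode.py automates across a whole Lib/ tree.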

Sequence Diagram

sequenceDiagram
    participant Compiler as Codegen
    participant ExceptionTable as Exception Table
    participant Bytecode as Emitted Bytecode
    participant VM as Runtime VM
    participant Frame as Frame Stack

    Note over Compiler,ExceptionTable: Compile-time: new handler layout
    Compiler->>ExceptionTable: register cleanup & handler ranges
    Compiler->>Bytecode: emit TRY, cleanup, Reraise{depth:0} inside cleanup

    Note over Bytecode,VM: Runtime unwind now preserves extra values
    VM->>Frame: exception raised -> consult Exception Table
    Frame->>VM: unwind to cleanup handler (preserved values remain)
    VM->>Frame: outer exception-table unwinder discards preserved values to target

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly Related PRs

  • Bytecode parity #7541: Refactors try/except compilation (related to try/except handler emission and wiring changes).

Poem

🐰 I hopped through code with eager paws,

rearranged the try/finally laws.
Tuples slimmed and swaps set free,
frames now keep their history.
I sniffed bytecode, matched it true—hip-hop, we pass through!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name | Status | Explanation

  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'Bytecode parity - exception' directly relates to the main objective of achieving bytecode compatibility with CPython for exception handling, as evidenced by substantial changes across exception-handling code paths (reraise, except*, with/async-with) and the addition of bytecode comparison scripts.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which meets the required threshold of 80.00%.



@youknowone force-pushed the bytecode-parity-exception branch 2 times, most recently from 7e7e111 to a15185e on April 5, 2026 at 05:06
@github-actions
Contributor

github-actions bot commented Apr 5, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[ ] test: cpython/Lib/test/test_compile.py (TODO: 27)
[x] test: cpython/Lib/test/test_compiler_assemble.py
[x] test: cpython/Lib/test/test_compiler_codegen.py
[x] test: cpython/Lib/test/test_peepholer.py (TODO: 20)

dependencies:

dependent tests: (no tests depend on compile)

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@youknowone force-pushed the bytecode-parity-exception branch 4 times, most recently from 876791a to 8b264e5 on April 6, 2026 at 14:28
@youknowone youknowone marked this pull request as ready for review April 10, 2026 02:15

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/codegen/src/compile.rs (2)

3958-3987: ⚠️ Potential issue | 🔴 Critical

orelse is re-armed for finally on the wrong block.

SetupFinally and push_fblock_full(FinallyTry, ...) run before switch_to_block(else_block), so they attach to the current block (cleanup_block), which is dead after Reraise { depth: 1 }. The orelse bytecode then executes without an active finally setup, so an exception raised from orelse can skip finalbody.

Suggested fix
-        if !finalbody.is_empty() {
-            emit!(
-                self,
-                PseudoInstruction::SetupFinally {
-                    delta: finally_block
-                }
-            );
-            self.push_fblock_full(
-                FBlockType::FinallyTry,
-                finally_block,
-                finally_block,
-                FBlockDatum::FinallyBody(finalbody.to_vec()),
-            )?;
-        }
         if else_block != continuation_block {
             self.switch_to_block(else_block);
+            if !finalbody.is_empty() {
+                emit!(
+                    self,
+                    PseudoInstruction::SetupFinally {
+                        delta: finally_block
+                    }
+                );
+                self.push_fblock_full(
+                    FBlockType::FinallyTry,
+                    finally_block,
+                    finally_block,
+                    FBlockDatum::FinallyBody(finalbody.to_vec()),
+                )?;
+            }
             self.compile_statements(orelse)?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 3958 - 3987, The SetupFinally
emission and push_fblock_full(FBlockType::FinallyTry, ...) are being emitted
before switch_to_block(else_block) so the finally setup attaches to the current
(cleanup) block; move the emit!(..., PseudoInstruction::SetupFinally { delta:
finally_block }) and the self.push_fblock_full(FBlockType::FinallyTry,
finally_block, finally_block, FBlockDatum::FinallyBody(finalbody.to_vec())) call
to immediately after self.switch_to_block(else_block) and before
self.compile_statements(orelse) so the finally is armed on else_block; keep the
existing PopBlock emit and self.pop_fblock(FBlockType::FinallyTry) after
compiling orelse to unwind the finally for the else path.

4006-4023: ⚠️ Potential issue | 🔴 Critical

except* ... finally still drops its cleanup handler before rethrowing.

This exception-path finally still emits PopBlock/pop_fblock(FinallyEnd) before Reraise { depth: 0 }. That removes the cleanup range just before the original exception is re-raised, so COPY 3; POP_EXCEPT; RERAISE 1 never gets a chance to restore exc_info. It reintroduces the same bug you already fixed in compile_try_statement.

Suggested fix
-            if finally_cleanup_block.is_some() {
-                emit!(self, PseudoInstruction::PopBlock);
-                self.pop_fblock(FBlockType::FinallyEnd);
-            }
-
             emit!(self, Instruction::Reraise { depth: 0 });
+
+            if finally_cleanup_block.is_some() {
+                emit!(self, PseudoInstruction::PopBlock);
+                self.pop_fblock(FBlockType::FinallyEnd);
+            }
🧹 Nitpick comments (4)
crates/codegen/src/ir.rs (2)

1408-1427: Use instruction_lineno() helper for consistent line number handling.

The next_swappable function extracts line numbers directly from info.location.line.get(), ignoring any lineno_override that may be set (e.g., -1 for cold blocks). The existing instruction_lineno helper at line 3171 handles this correctly.

This inconsistency could cause incorrect line comparisons for instructions with overridden line numbers.

♻️ Proposed fix
         fn next_swappable(instrs: &[InstructionInfo], mut i: usize, lineno: i32) -> Option<usize> {
             loop {
                 i += 1;
                 if i >= instrs.len() {
                     return None;
                 }
                 let info = &instrs[i];
-                let info_lineno = info.location.line.get() as i32;
+                let info_lineno = instruction_lineno(info);
                 if lineno >= 0 && info_lineno > 0 && info_lineno != lineno {
                     return None;
                 }

Note: You'll need to either make instruction_lineno visible here or inline the logic:

let info_lineno = info.lineno_override.unwrap_or_else(|| info.location.line.get() as i32);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/ir.rs` around lines 1408 - 1427, In next_swappable, the
code reads the instruction line directly from info.location.line.get(), which
ignores any lineno_override; change it to use the existing instruction_lineno
helper (or inline its logic) so comparisons account for overrides (e.g., let
info_lineno = instruction_lineno(info) or use
info.lineno_override.unwrap_or_else(|| info.location.line.get() as i32)); update
next_swappable to call instruction_lineno/info.lineno_override to get
info_lineno before comparing with lineno.

1978-1980: Consider avoiding unnecessary clone when no changes occur.

The instructions are cloned unconditionally at line 1978, but the clone is only needed when changed becomes true. For blocks where no LOAD_FAST needs conversion to LOAD_FAST_CHECK, this clone is wasted work.

However, since this runs once during compilation and the optimization benefit is marginal, this is fine as-is.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/ir.rs` around lines 1978 - 1980, The code unconditionally
clones self.blocks[idx].instructions into old_instructions even when no changes
occur; to fix, iterate over references to self.blocks[idx].instructions (or take
a slice) and only allocate/clone into new_instructions when you detect a needed
change (when encountering a LOAD_FAST that must become LOAD_FAST_CHECK), then
replace or assign the modified Vec back into self.blocks[idx].instructions (or
use std::mem::replace/take) so blocks that are unchanged avoid the clone; update
uses of old_instructions/new_instructions accordingly (refer to
old_instructions, new_instructions, changed, and self.blocks[idx].instructions).
crates/codegen/src/compile.rs (1)

8509-8544: Please add an EX_CALL regression for optimized super() calls.

This introduces a new bytecode shape for super().method(*args, **kwargs) / super(C, self).method(*args). A focused disassembly or behavior test here would make future parity regressions much easier to catch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 8509 - 8544, Add a regression
test that exercises the optimized super() call path (when
can_optimize_super_call returns Some) for both ZeroArg and TwoArg SuperCallType
so we catch the new CALL_FUNCTION_EX shape: construct snippets like
super().method(*args, **kwargs) and super(C, self).method(*args), compile them
and assert the generated bytecode/assembly contains the EX_CALL/CALL_FUNCTION_EX
path (or that uses_ex_call branch was taken) and the sequence produced by
emit_load_super_attr/emit_load_zero_super_attr plus the emitted PushNull and
codegen_call_helper invocation; place the test with other codegen/compile tests
and name it to indicate it’s a regression for optimized super EX_CALLs.
crates/vm/src/vm/mod.rs (1)

2067-2081: Consider removing unreachable else branch.

With ExceptionStack::default() now initializing the stack with vec![None], the stack should never be empty. The debug_assert on line 2070-2073 confirms this invariant, making the else branch (lines 2079-2080) unreachable dead code.

♻️ Suggested simplification
     pub(crate) fn set_exception(&self, exc: Option<PyBaseExceptionRef>) {
         // don't be holding the RefCell guard while __del__ is called
         let mut excs = self.exceptions.borrow_mut();
         debug_assert!(
             !excs.stack.is_empty(),
             "set_exception called with empty exception stack"
         );
-        if let Some(top) = excs.stack.last_mut() {
-            let prev = core::mem::replace(top, exc);
-            drop(excs);
-            drop(prev);
-        } else {
-            excs.stack.push(exc);
-            drop(excs);
-        }
+        let top = excs
+            .stack
+            .last_mut()
+            .expect("set_exception called with empty exception stack");
+        let prev = core::mem::replace(top, exc);
+        drop(excs);
+        drop(prev);
         #[cfg(feature = "threading")]
         thread::update_thread_exception(self.topmost_exception());
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/vm/mod.rs` around lines 2067 - 2081, The else branch in
set_exception is unreachable because ExceptionStack::default() seeds stack with
vec![None] and there's already a debug_assert that the stack is non-empty;
remove the unreachable else branch and simplify set_exception to always operate
on excs.stack.last_mut() (i.e., replace the top entry and drop the previous
value) while preserving the borrow_mut, debug_assert, core::mem::replace usage
and drop(prev) semantics so behavior and lifetimes around __del__ remain
unchanged.
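As a rough Python analogy (not the VM code) of the push/pop exception-slot model this comment relies on, where the stack is seeded with one empty slot so `set_exception` always has a top entry to replace:

```python
class ExceptionStack:
    """Illustrative sketch of the exception-slot model described above.

    The stack starts with one None slot, so set_exception can always
    replace the top entry, and resuming a frame pushes a fresh slot
    that is popped on exit to restore the caller's slot.
    """

    def __init__(self):
        self.stack = [None]  # seeded slot: the stack is never empty

    def set_exception(self, exc):
        prev, self.stack[-1] = self.stack[-1], exc
        return prev  # caller may drop the previous value safely

    def push(self):
        # Entering a resumed generator/coroutine frame.
        self.stack.append(None)

    def pop(self):
        # Leaving it: the caller's slot becomes the top again.
        return self.stack.pop()

    def current(self):
        return self.stack[-1]
```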
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/codegen/src/ir.rs`:
- Around line 1429-1431: Rename the local variable instrs to a non-spell-checked
name (e.g., instructions) to avoid the spell-check warning: in the loop over
self.blocks where you currently bind let instrs = &mut block.instructions;
change that binding to let instructions = &mut block.instructions; and update
all subsequent uses in this scope (same pattern as in
optimize_build_tuple_unpack) so references to instrs are replaced with
instructions.
- Around line 1514-1516: Rename the local variable named `instrs` to
`instructions` in the loop that iterates `for block in &mut self.blocks` (the
binding currently `let instrs = &mut block.instructions;`), and update all
references within that loop's scope to use `instructions` instead of `instrs` so
the spell checker no longer flags the abbreviation; leave `block.instructions`
unchanged.
- Around line 1345-1352: The local variable name `instrs` (declared as `let
instrs = &mut block.instructions;`) must be renamed to `instructions` to satisfy
the spell checker; update the binding and every use in this scope (e.g.,
`instrs.len()`, `instrs[i]`, `instrs[i + 1]`) to `instructions`, keeping the
same reference to `block.instructions` and preserving the surrounding logic that
checks `Instruction::BuildTuple` and `Instruction::UnpackSequence`.

In `@scripts/compare_bytecode.py`:
- Around line 103-127: The current _finish_one() swallows worker failures by
returning {} on timeout/parse errors and unlinking files_file so finish_dump()
loses those file paths; change _finish_one() to preserve the assigned file
entries on error by reading the files list from files_file before unlinking and
returning a fallback mapping (e.g., each filename -> [] or a special failure
marker) instead of {}, and ensure this path is taken for
subprocess.TimeoutExpired, non-zero proc.returncode, and JSONDecodeError cases
so finish_dump() receives the original file set; use the existing files_file and
proc symbols to locate and implement reading the file list and returning the
appropriate fallback result.
- Around line 59-61: The relative path used for filtering (relpath from
os.path.relpath(fpath, lib_dir)) may contain Windows backslashes and thus fail
fnmatch.fnmatch(pattern) checks; normalize relpath to POSIX separators (e.g. use
pathlib.Path(relpath).as_posix() or replace backslashes with '/') before calling
fnmatch.fnmatch(relpath, pattern) so patterns like "asyncio/*.py" match
cross-platform.
- Line 342: Replace the inline lambda assignments with proper function
definitions: change the assignment "log = lambda *a, **kw: print(*a,
file=sys.stderr, **kw)" to a def-based implementation (e.g., def log(*a, **kw):
print(*a, file=sys.stderr, **kw)) and do the same for the other lambda
assignment mentioned in the comment (the second lambda at line 424) so both
named callables are defined with def instead of lambdas to satisfy E731; keep
the same signature and behavior when converting to the new functions.

In `@scripts/dis_dump.py`:
- Around line 116-117: The _RP_CMP_OPS dict is incorrect and contains an
extraneous key 6; update the mapping for the COMPARE_OP discriminants in the
top-level symbol _RP_CMP_OPS to match RustPython's ComparisonOperator (0→"<",
1→"<=", 2→"==", 3→"!=", 4→">", 5→">="), removing the invalid key 6 so downstream
COMPARE_OP output is labelled correctly.
- Around line 134-141: The branch that treats deref-like opnames (the elif
listing
"LOAD_DEREF","STORE_DEREF","DELETE_DEREF","LOAD_CLOSURE","MAKE_CELL","COPY_FREE_VARS")
incorrectly treats COPY_FREE_VARS as a name/index into localsplus; remove
"COPY_FREE_VARS" from that tuple (or add an explicit case for COPY_FREE_VARS
before this branch) so its operand is always rendered as a numeric count rather
than passed through the deref/name fallback; update any comments/tests that
assume COPY_FREE_VARS is numeric.
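For reference, the corrected COMPARE_OP mapping from the first dis_dump.py comment can be sketched directly; the values are taken from the review text, and `cmp_op_label` is a hypothetical helper, not part of the script:

```python
# ComparisonOperator discriminants -> symbols, per the review comment:
# 0 -> "<", 1 -> "<=", 2 -> "==", 3 -> "!=", 4 -> ">", 5 -> ">="
_RP_CMP_OPS = {0: "<", 1: "<=", 2: "==", 3: "!=", 4: ">", 5: ">="}

def cmp_op_label(arg):
    """Label a COMPARE_OP operand, falling back for unknown values."""
    return _RP_CMP_OPS.get(arg, "cmp#%d" % arg)
```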

---

Outside diff comments:
In `@crates/codegen/src/compile.rs`:
- Around line 3958-3987: The SetupFinally emission and
push_fblock_full(FBlockType::FinallyTry, ...) are being emitted before
switch_to_block(else_block) so the finally setup attaches to the current
(cleanup) block; move the emit!(..., PseudoInstruction::SetupFinally { delta:
finally_block }) and the self.push_fblock_full(FBlockType::FinallyTry,
finally_block, finally_block, FBlockDatum::FinallyBody(finalbody.to_vec())) call
to immediately after self.switch_to_block(else_block) and before
self.compile_statements(orelse) so the finally is armed on else_block; keep the
existing PopBlock emit and self.pop_fblock(FBlockType::FinallyTry) after
compiling orelse to unwind the finally for the else path.

---

Nitpick comments:
In `@crates/codegen/src/compile.rs`:
- Around line 8509-8544: Add a regression test that exercises the optimized
super() call path (when can_optimize_super_call returns Some) for both ZeroArg
and TwoArg SuperCallType so we catch the new CALL_FUNCTION_EX shape: construct
snippets like super().method(*args, **kwargs) and super(C, self).method(*args),
compile them and assert the generated bytecode/assembly contains the
EX_CALL/CALL_FUNCTION_EX path (or that uses_ex_call branch was taken) and the
sequence produced by emit_load_super_attr/emit_load_zero_super_attr plus the
emitted PushNull and codegen_call_helper invocation; place the test with other
codegen/compile tests and name it to indicate it’s a regression for optimized
super EX_CALLs.

In `@crates/codegen/src/ir.rs`:
- Around line 1408-1427: In next_swappable, the code reads the instruction line
directly from info.location.line.get(), which ignores any lineno_override;
change it to use the existing instruction_lineno helper (or inline its logic) so
comparisons account for overrides (e.g., let info_lineno =
instruction_lineno(info) or use info.lineno_override.unwrap_or_else(||
info.location.line.get() as i32)); update next_swappable to call
instruction_lineno/info.lineno_override to get info_lineno before comparing with
lineno.
- Around line 1978-1980: The code unconditionally clones
self.blocks[idx].instructions into old_instructions even when no changes occur;
to fix, iterate over references to self.blocks[idx].instructions (or take a
slice) and only allocate/clone into new_instructions when you detect a needed
change (when encountering a LOAD_FAST that must become LOAD_FAST_CHECK), then
replace or assign the modified Vec back into self.blocks[idx].instructions (or
use std::mem::replace/take) so blocks that are unchanged avoid the clone; update
uses of old_instructions/new_instructions accordingly (refer to
old_instructions, new_instructions, changed, and self.blocks[idx].instructions).

In `@crates/vm/src/vm/mod.rs`:
- Around line 2067-2081: The else branch in set_exception is unreachable because
ExceptionStack::default() seeds stack with vec![None] and there's already a
debug_assert that the stack is non-empty; remove the unreachable else branch and
simplify set_exception to always operate on excs.stack.last_mut() (i.e., replace
the top entry and drop the previous value) while preserving the borrow_mut,
debug_assert, core::mem::replace usage and drop(prev) semantics so behavior and
lifetimes around __del__ remain unchanged.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 363a271e-15bf-4199-b435-f764c39d9979

📥 Commits

Reviewing files that changed from the base of the PR and between eed618d and e5aeba7.

⛔ Files ignored due to path filters (2)
  • Lib/test/test_peepholer.py is excluded by !Lib/**
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
📒 Files selected for processing (6)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/vm/mod.rs
  • scripts/compare_bytecode.py
  • scripts/dis_dump.py

Comment thread crates/codegen/src/ir.rs
Comment on lines +1345 to +1352
let instrs = &mut block.instructions;
let len = instrs.len();
for i in 0..len.saturating_sub(1) {
let Some(Instruction::BuildTuple { .. }) = instrs[i].instr.real() else {
continue;
};
let n = u32::from(instrs[i].arg);
let Some(Instruction::UnpackSequence { .. }) = instrs[i + 1].instr.real() else {

⚠️ Potential issue | 🟡 Minor

Fix spell check failure: rename instrs to instructions.

The CI pipeline fails due to the spell checker not recognizing instrs. This is a blocking issue.

🔧 Proposed fix
     fn optimize_build_tuple_unpack(&mut self) {
         for block in &mut self.blocks {
-            let instrs = &mut block.instructions;
-            let len = instrs.len();
+            let instructions = &mut block.instructions;
+            let len = instructions.len();
             for i in 0..len.saturating_sub(1) {
-                let Some(Instruction::BuildTuple { .. }) = instrs[i].instr.real() else {
+                let Some(Instruction::BuildTuple { .. }) = instructions[i].instr.real() else {
                     continue;
                 };
-                let n = u32::from(instrs[i].arg);
-                let Some(Instruction::UnpackSequence { .. }) = instrs[i + 1].instr.real() else {
+                let n = u32::from(instructions[i].arg);
+                let Some(Instruction::UnpackSequence { .. }) = instructions[i + 1].instr.real() else {
                     continue;
                 };
-                if u32::from(instrs[i + 1].arg) != n {
+                if u32::from(instructions[i + 1].arg) != n {
                     continue;
                 }
                 match n {
                     1 => {
-                        instrs[i].instr = AnyInstruction::Real(Instruction::Nop);
-                        instrs[i].arg = OpArg::new(0);
-                        instrs[i + 1].instr = AnyInstruction::Real(Instruction::Nop);
-                        instrs[i + 1].arg = OpArg::new(0);
+                        instructions[i].instr = AnyInstruction::Real(Instruction::Nop);
+                        instructions[i].arg = OpArg::new(0);
+                        instructions[i + 1].instr = AnyInstruction::Real(Instruction::Nop);
+                        instructions[i + 1].arg = OpArg::new(0);
                     }
                     2 | 3 => {
-                        instrs[i].instr = AnyInstruction::Real(Instruction::Nop);
-                        instrs[i].arg = OpArg::new(0);
-                        instrs[i + 1].instr =
+                        instructions[i].instr = AnyInstruction::Real(Instruction::Nop);
+                        instructions[i].arg = OpArg::new(0);
+                        instructions[i + 1].instr =
                             AnyInstruction::Real(Instruction::Swap { i: Arg::marker() });
-                        instrs[i + 1].arg = OpArg::new(n);
+                        instructions[i + 1].arg = OpArg::new(n);
                     }
                     _ => {}
                 }
             }
         }
     }
🧰 Tools
🪛 GitHub Actions: CI

[error] 1345-1352: cspell failed (exit code 1). Unknown word(s) 'instrs' reported at lines 1345:17, 1346:23, 1348:60, 1351:35, 1352:64.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/ir.rs` around lines 1345 - 1352, The local variable name
`instrs` (declared as `let instrs = &mut block.instructions;`) must be renamed
to `instructions` to satisfy the spell checker; update the binding and every use
in this scope (e.g., `instrs.len()`, `instrs[i]`, `instrs[i + 1]`) to
`instructions`, keeping the same reference to `block.instructions` and
preserving the surrounding logic that checks `Instruction::BuildTuple` and
`Instruction::UnpackSequence`.
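The transformation this thread discusses can be sketched in Python over (opname, arg) pairs. This mirrors the Rust pass only illustratively: BUILD_TUPLE n followed by UNPACK_SEQUENCE n is a pure stack permutation, so n == 1 becomes two NOPs and n == 2 or 3 becomes NOP plus SWAP n.

```python
def optimize_build_tuple_unpack(instructions):
    """Peephole sketch: rewrite BUILD_TUPLE n + UNPACK_SEQUENCE n pairs.

    Packing n values and immediately unpacking them reverses the top n
    stack entries, which for n == 1 is the identity and for n in (2, 3)
    is exactly what SWAP n does.
    """
    out = list(instructions)
    for i in range(len(out) - 1):
        op1, n1 = out[i]
        op2, n2 = out[i + 1]
        if op1 == "BUILD_TUPLE" and op2 == "UNPACK_SEQUENCE" and n1 == n2:
            if n1 == 1:
                out[i] = ("NOP", 0)
                out[i + 1] = ("NOP", 0)
            elif n1 in (2, 3):
                out[i] = ("NOP", 0)
                out[i + 1] = ("SWAP", n1)
            # larger n: left untouched, matching the pass above
    return out
```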

Comment thread crates/codegen/src/ir.rs
Comment on lines +1429 to +1431
for block in &mut self.blocks {
let instrs = &mut block.instructions;
let len = instrs.len();

⚠️ Potential issue | 🟡 Minor

Same spell check issue: rename instrs to instructions.

Same issue as in optimize_build_tuple_unpack—the instrs variable name triggers the spell checker.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/ir.rs` around lines 1429 - 1431, Rename the local variable
instrs to a non-spell-checked name (e.g., instructions) to avoid the spell-check
warning: in the loop over self.blocks where you currently bind let instrs = &mut
block.instructions; change that binding to let instructions = &mut
block.instructions; and update all subsequent uses in this scope (same pattern
as in optimize_build_tuple_unpack) so references to instrs are replaced with
instructions.

Comment thread crates/codegen/src/ir.rs
Comment on lines +1514 to +1516
for block in &mut self.blocks {
let instrs = &mut block.instructions;
let len = instrs.len();

⚠️ Potential issue | 🟡 Minor

Same spell check issue: rename instrs to instructions.

The instrs variable at line 1515 also triggers the spell checker failure.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/ir.rs` around lines 1514 - 1516, Rename the local variable
named `instrs` to `instructions` in the loop that iterates `for block in &mut
self.blocks` (the binding currently `let instrs = &mut block.instructions;`),
and update all references within that loop's scope to use `instructions` instead
of `instrs` so the spell checker no longer flags the abbreviation; leave
`block.instructions` unchanged.

Comment on lines +59 to +61
relpath = os.path.relpath(fpath, lib_dir)
if pattern and not fnmatch.fnmatch(relpath, pattern):
continue

⚠️ Potential issue | 🟡 Minor

Normalize relpath before applying --filter.

On Windows, os.path.relpath() returns backslashes, so the documented filter syntax like asyncio/*.py will not match. Converting the relative path to POSIX separators before fnmatch keeps the CLI portable.

Suggested fix
             fpath = os.path.join(root, fname)
-            relpath = os.path.relpath(fpath, lib_dir)
-            if pattern and not fnmatch.fnmatch(relpath, pattern):
+            relpath = os.path.relpath(fpath, lib_dir)
+            match_path = relpath.replace(os.sep, "/")
+            if pattern and not fnmatch.fnmatch(match_path, pattern):
                 continue
             targets.append((relpath, fpath))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/compare_bytecode.py` around lines 59 - 61, The relative path used for
filtering (relpath from os.path.relpath(fpath, lib_dir)) may contain Windows
backslashes and thus fail fnmatch.fnmatch(pattern) checks; normalize relpath to
POSIX separators (e.g. use pathlib.Path(relpath).as_posix() or replace
backslashes with '/') before calling fnmatch.fnmatch(relpath, pattern) so
patterns like "asyncio/*.py" match cross-platform.
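
The separator mismatch called out above can be demonstrated directly. This sketch runs on a POSIX host (where fnmatch applies no separator normalization) and uses a hypothetical Windows-style relpath literal rather than a file from this PR:

```python
import fnmatch

# What os.path.relpath() would produce on Windows
win_relpath = "asyncio\\tasks.py"

# The documented filter does not match the raw Windows path...
assert not fnmatch.fnmatch(win_relpath, "asyncio/*.py")

# ...but does once separators are normalized to POSIX form
match_path = win_relpath.replace("\\", "/")
assert fnmatch.fnmatch(match_path, "asyncio/*.py")
```

Normalizing only the copy used for matching keeps relpath intact for reporting.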

Comment on lines +103 to +127
def _finish_one(proc, files_file):
"""Wait for a single dis_dump.py process and return parsed JSON."""
try:
stdout = proc.communicate(timeout=600)[0]
except subprocess.TimeoutExpired:
proc.kill()
proc.communicate()
print(" Timeout (600s)", file=sys.stderr)
os.unlink(files_file)
return {}
finally:
if os.path.exists(files_file):
os.unlink(files_file)

if proc.returncode != 0:
print(" Warning: exited with code %d" % proc.returncode, file=sys.stderr)

content = stdout.decode(errors="replace").strip()
if not content:
return {}
try:
return json.loads(content)
except json.JSONDecodeError as e:
print(" JSON parse error: %s" % e, file=sys.stderr)
return {}

⚠️ Potential issue | 🟠 Major

Don't let a failed worker disappear from the final result set.

A timeout or JSON parse failure currently makes _finish_one() return {}, and finish_dump() then drops every file assigned to that worker. Those paths never make it into all_files, so the report can overstate coverage, and the current exit check can still return success after RustPython skipped files.

Suggested direction
-def _start_one(interpreter, targets, base_dir):
+def _start_one(interpreter, targets, base_dir):
@@
-        return proc, files_file.name
+        return proc, files_file.name, [relpath for relpath, _ in targets]

-def start_dump(interpreter, targets, base_dir, num_workers=1):
+def start_dump(interpreter, targets, base_dir, num_workers=1):
@@
-        return [(proc, ff)]
+        return [(proc, ff, [relpath for relpath, _ in targets])]
@@
-def finish_dump(procs):
+def finish_dump(procs):
     """Wait for all dis_dump.py processes and merge results."""
     merged = {}
-    for proc, files_file in procs:
-        merged.update(_finish_one(proc, files_file))
+    for proc, files_file, relpaths in procs:
+        data = _finish_one(proc, files_file)
+        if data:
+            merged.update(data)
+        else:
+            for relpath in relpaths:
+                merged.setdefault(
+                    relpath, {"status": "error", "error": "bytecode dump failed"}
+                )
     return merged
@@
-    sys.exit(0 if differ == 0 and rp_err == 0 else 1)
+    sys.exit(0 if differ == 0 and rp_err == 0 and rp_miss == 0 else 1)

Also applies to: 130-148, 361-417, 535-536

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/compare_bytecode.py` around lines 103 - 127, The current
_finish_one() swallows worker failures by returning {} on timeout/parse errors
and unlinking files_file so finish_dump() loses those file paths; change
_finish_one() to preserve the assigned file entries on error by reading the
files list from files_file before unlinking and returning a fallback mapping
(e.g., each filename -> [] or a special failure marker) instead of {}, and
ensure this path is taken for subprocess.TimeoutExpired, non-zero
proc.returncode, and JSONDecodeError cases so finish_dump() receives the
original file set; use the existing files_file and proc symbols to locate and
implement reading the file list and returning the appropriate fallback result.
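
The direction above can be exercised in isolation. This is a minimal model rather than the script's real code: finish_one stands in for _finish_one(), the (proc, files_file, relpaths) tuples are faked, and the error payload shape is an assumption:

```python
def finish_dump(procs, finish_one):
    """Merge worker results, synthesizing per-file error entries for any
    worker whose dump came back empty (timeout, crash, or bad JSON)."""
    merged = {}
    for proc, files_file, relpaths in procs:
        data = finish_one(proc, files_file)
        if data:
            merged.update(data)
        else:
            # Failed worker: keep its assigned files visible in the report
            for relpath in relpaths:
                merged.setdefault(
                    relpath, {"status": "error", "error": "bytecode dump failed"}
                )
    return merged

# One healthy worker and one that timed out (stub keyed by files_file)
def stub(proc, ff):
    return {"ok.py": {"status": "ok"}} if ff == "a.txt" else {}

procs = [(None, "a.txt", ["ok.py"]), (None, "b.txt", ["lost.py"])]
merged = finish_dump(procs, stub)
```

With this shape, lost.py survives as an explicit error entry instead of vanishing from the report.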

sys.exit(1)

report_path = args.output
log = lambda *a, **kw: print(*a, file=sys.stderr, **kw)

⚠️ Potential issue | 🟡 Minor


Replace the lambda assignments to comply with E731.

Both lambda assignments at lines 342 and 424 violate the E731 rule in Ruff. Replace them with proper def statements:

Suggested fix
-    log = lambda *a, **kw: print(*a, file=sys.stderr, **kw)
+    def log(*a, **kw):
+        print(*a, file=sys.stderr, **kw)
@@
-        p = lambda *a: print(*a, file=out)
+        def p(*a):
+            print(*a, file=out)
🧰 Tools
🪛 Flake8 (7.3.0)

[error] 342-342: do not assign a lambda expression, use a def

(E731)

🪛 Ruff (0.15.9)

[error] 342-342: Do not assign a lambda expression, use a def

Rewrite log as a def

(E731)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/compare_bytecode.py` at line 342, Replace the inline lambda
assignments with proper function definitions: change the assignment "log =
lambda *a, **kw: print(*a, file=sys.stderr, **kw)" to a def-based implementation
(e.g., def log(*a, **kw): print(*a, file=sys.stderr, **kw)) and do the same for
the other lambda assignment mentioned in the comment (the second lambda at line
424) so both named callables are defined with def instead of lambdas to satisfy
E731; keep the same signature and behavior when converting to the new functions.

- Align except* bytecode chaining
- Fix exception state model and finally handler cleanup
- Fix RERAISE to only pop exception, preserve values below
- BUILD_TUPLE n + UNPACK_SEQUENCE n elimination
- Dead store elimination within basic blocks
- apply_static_swaps for SWAP reduction
- compare_bytecode.py: compare CPython vs RustPython bytecode output
- dis_dump.py: extract disassembly in normalized JSON format
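
The dead-store pass in the commit list above can be modeled compactly. This Python sketch mirrors the two-pass shape of the Rust implementation (find the last occurrence of each slot, then rewrite earlier stores); the tuple encoding of instructions is a hypothetical stand-in for the real IR:

```python
def eliminate_dead_stores(run):
    """Within a straight-line run of STORE_FAST instructions, only the
    last store to each slot is observable; earlier stores are rewritten
    to POP_TOP, which preserves the stack effect."""
    # Pass 1: record the last occurrence of each variable slot
    last = {}
    for j, (_, slot) in enumerate(run):
        last[slot] = j
    # Pass 2: non-last stores to the same slot are dead
    return [
        instr if last[instr[1]] == j else ("POP_TOP", None)
        for j, instr in enumerate(run)
    ]

run = [("STORE_FAST", 0), ("STORE_FAST", 1), ("STORE_FAST", 0)]
optimized = eliminate_dead_stores(run)
# → [("POP_TOP", None), ("STORE_FAST", 1), ("STORE_FAST", 0)]
```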
@youknowone youknowone force-pushed the bytecode-parity-exception branch from e5aeba7 to 6824360 on April 10, 2026 03:14

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/codegen/src/compile.rs (1)

3958-3980: ⚠️ Potential issue | 🟠 Major

Move the re-pushed SetupFinally into else_block.

These instructions are emitted before switch_to_block(else_block), so they end up in cleanup_block, not on the try-success/else path. At runtime, orelse then executes PopBlock for a SETUP_FINALLY it never entered, and exceptions raised from orelse will bypass finally.

Proposed fix
-        if !finalbody.is_empty() {
-            emit!(
-                self,
-                PseudoInstruction::SetupFinally {
-                    delta: finally_block
-                }
-            );
-            self.push_fblock_full(
-                FBlockType::FinallyTry,
-                finally_block,
-                finally_block,
-                FBlockDatum::FinallyBody(finalbody.to_vec()),
-            )?;
-        }
         if else_block != continuation_block {
             self.switch_to_block(else_block);
+            if !finalbody.is_empty() {
+                emit!(
+                    self,
+                    PseudoInstruction::SetupFinally {
+                        delta: finally_block
+                    }
+                );
+                self.push_fblock_full(
+                    FBlockType::FinallyTry,
+                    finally_block,
+                    finally_block,
+                    FBlockDatum::FinallyBody(finalbody.to_vec()),
+                )?;
+            }
             self.compile_statements(orelse)?;
 
             if !finalbody.is_empty() {
                 // Pop the FinallyTry fblock we just pushed for the else path
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 3958 - 3980, The SetupFinally
emit and push_fblock_full for the finally path are emitted before
switch_to_block(else_block), so they land in cleanup_block instead of the else
path; move the emit!(..., PseudoInstruction::SetupFinally { delta: finally_block
}) and the subsequent self.push_fblock_full(FBlockType::FinallyTry,
finally_block, finally_block, FBlockDatum::FinallyBody(finalbody.to_vec())) so
they occur after switch_to_block(else_block) (i.e., inside the else branch
before compile_statements(orelse)) so the SetupFinally is actually entered on
the else/try-success path and the matching emit!(...,
PseudoInstruction::PopBlock) and self.pop_fblock(FBlockType::FinallyTry)
correspond to an entered finally block.
♻️ Duplicate comments (3)
scripts/compare_bytecode.py (3)

66-98: ⚠️ Potential issue | 🟠 Major

Don't let a failed worker drop its entire file set.

When _finish_one() hits a timeout or JSON error, it returns {}. finish_dump() then loses every file assigned to that worker, so those paths never reach all_files; the report can undercount coverage and still exit successfully after RustPython skipped work. Carry the worker's relpaths alongside (proc, files_file) and synthesize per-file error entries on timeout, non-zero exit, or JSON parse failure instead of returning an empty mapping.

Also applies to: 103-148

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/compare_bytecode.py` around lines 66 - 98, The worker currently
returns an empty mapping on timeout/JSON errors causing finish_dump to drop all
files for that worker; modify _start_one to return the relpath list alongside
(proc, files_file) (e.g., return (proc, files_file, relpaths)) so callers have
the assigned paths, and update _finish_one/finish_dump to, on timeout, non-zero
exit code, or JSON parse failure, synthesize per-file error entries for each
relpath (marking failure/reason) instead of returning {} and ensure those
synthesized entries are merged into all_files so no files are lost from the
final report; update any code unpacking the tuple to expect the extra relpaths
element.

342-342: ⚠️ Potential issue | 🟡 Minor

Replace the named lambdas with local defs.

Both assignments still trip Ruff/Flake8 E731. Plain local functions keep the same behavior without failing lint.

Suggested fix
-    log = lambda *a, **kw: print(*a, file=sys.stderr, **kw)
+    def log(*a, **kw):
+        print(*a, file=sys.stderr, **kw)
@@
-        p = lambda *a: print(*a, file=out)
+        def p(*a):
+            print(*a, file=out)

As per coding guidelines, "Use ruff for linting Python code".

Also applies to: 424-424

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/compare_bytecode.py` at line 342, The code assigns a named lambda to
a variable (e.g., the lambda assigned to "log" and a similar named-lambda
later), which triggers Ruff/Flake8 E731; replace each named lambda with an
equivalent local def (for example, define def log(*a, **kw): print(*a,
file=sys.stderr, **kw)) so behavior remains identical but lint passes, and
update any callers to use the new local function name.

59-60: ⚠️ Potential issue | 🟡 Minor

Normalize the filter path before calling fnmatch().

Line 59 produces OS-native separators, so on Windows a documented filter like asyncio/*.py will not match. Compare against a POSIX-normalized copy and keep relpath unchanged for reporting.

Suggested fix
             fpath = os.path.join(root, fname)
             relpath = os.path.relpath(fpath, lib_dir)
-            if pattern and not fnmatch.fnmatch(relpath, pattern):
+            match_path = relpath.replace(os.sep, "/")
+            if pattern and not fnmatch.fnmatch(match_path, pattern):
                 continue
             targets.append((relpath, fpath))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/compare_bytecode.py` around lines 59 - 60, The fnmatch check uses
OS-native relpath which on Windows uses backslashes, so normalize a POSIX-style
copy for matching and keep relpath unchanged for reporting: create a
POSIX-normalized variant (e.g., via pathlib.PurePosixPath(relpath).as_posix() or
relpath.replace(os.sep, "/")) and call fnmatch.fnmatch(posix_relpath, pattern)
instead of fnmatch.fnmatch(relpath, pattern) in the block that contains relpath
and pattern.
🧹 Nitpick comments (1)
crates/codegen/src/ir.rs (1)

1543-1543: Use alloc or crate-level collections for consistency.

This function uses std::collections::HashMap while the rest of the file uses alloc::collections (line 1 imports VecDeque from alloc). This inconsistency could cause issues if the crate needs no_std compatibility.

Consider using a local Vec or the crate's IndexMap instead, which would also maintain insertion order (though not strictly needed here).

♻️ Suggested alternative using a simple Vec scan
                 if run_end - run_start >= 2 {
-                    // Pass 1: find the LAST occurrence of each variable
-                    let mut last_occurrence = std::collections::HashMap::new();
-                    for (j, instr) in instrs[run_start..run_end].iter().enumerate() {
-                        last_occurrence.insert(u32::from(instr.arg), j);
-                    }
-                    // Pass 2: non-last stores to the same variable are dead
-                    for (j, instr) in instrs[run_start..run_end].iter_mut().enumerate() {
-                        let idx = u32::from(instr.arg);
-                        if last_occurrence[&idx] != j {
+                    // Pass 1: find which positions are NOT the last occurrence
+                    let run_len = run_end - run_start;
+                    let mut is_dead = vec![false; run_len];
+                    for j in 0..run_len {
+                        let idx = u32::from(instrs[run_start + j].arg);
+                        // Mark earlier occurrences as dead
+                        for k in 0..j {
+                            if u32::from(instrs[run_start + k].arg) == idx {
+                                is_dead[k] = true;
+                            }
+                        }
+                    }
+                    // Pass 2: replace dead stores with POP_TOP
+                    for (j, &dead) in is_dead.iter().enumerate() {
+                        if dead {
+                            let instr = &mut instrs[run_start + j];
                             instr.instr = AnyInstruction::Real(Instruction::PopTop);
                             instr.arg = OpArg::new(0);
                         }
                     }
                 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/ir.rs` at line 1543, Replace the std::collections::HashMap
usage for last_occurrence with an alloc-friendly structure: either use a simple
Vec<(KeyType, usize)> named last_occurrence and do a small linear search/update
where you currently insert/lookup, or switch to a crate-level map like
indexmap::IndexMap (or hashbrown::HashMap) if you need faster lookups; update
all uses of last_occurrence (creation and lookup/insert sites) in the
surrounding function so they operate on the chosen structure to preserve
no_std/alloc compatibility and maintain correct semantics.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 37dcc6ca-ae98-4afc-9d45-5142e130ea84

📥 Commits

Reviewing files that changed from the base of the PR and between e5aeba7 and 6824360.

⛔ Files ignored due to path filters (2)
  • Lib/test/test_peepholer.py is excluded by !Lib/**
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
📒 Files selected for processing (6)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/vm/mod.rs
  • scripts/compare_bytecode.py
  • scripts/dis_dump.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/vm/src/vm/mod.rs

Comment on lines +30 to +45
def find_rustpython():
"""Locate the RustPython binary, allowing release builds only."""
if "RUSTPYTHON" in os.environ:
path = os.environ["RUSTPYTHON"]
normalized = os.path.normpath(path)
debug_fragment = os.path.join("target", "debug", "rustpython")
if normalized.endswith(debug_fragment):
raise ValueError(
"RUSTPYTHON must point to a release binary, not target/debug/rustpython"
)
return path

path = os.path.join(PROJECT_ROOT, "target", "release", "rustpython")
if os.path.isfile(path) and os.access(path, os.X_OK):
return path
return None

⚠️ Potential issue | 🟡 Minor


Handle the Windows executable name in find_rustpython().

Line 44 only probes target/release/rustpython, and line 37 only rejects target/debug/rustpython. On Windows that misses rustpython.exe, so autodiscovery fails and the debug-build guard also stops working for RUSTPYTHON=...\target\debug\rustpython.exe. Use the platform-specific executable name for both paths, and update the error message accordingly.

For consistency with scripts/update_lib/cmd_auto_mark.py, use sys.platform == "win32" instead of os.name == "nt".

Suggested fix
 def find_rustpython():
     """Locate the RustPython binary, allowing release builds only."""
+    exe_name = "rustpython.exe" if sys.platform == "win32" else "rustpython"
     if "RUSTPYTHON" in os.environ:
         path = os.environ["RUSTPYTHON"]
         normalized = os.path.normpath(path)
-        debug_fragment = os.path.join("target", "debug", "rustpython")
+        debug_fragment = os.path.join("target", "debug", exe_name)
         if normalized.endswith(debug_fragment):
             raise ValueError(
-                "RUSTPYTHON must point to a release binary, not target/debug/rustpython"
+                f"RUSTPYTHON must point to a release binary, not target/debug/{exe_name}"
             )
         return path
 
-    path = os.path.join(PROJECT_ROOT, "target", "release", "rustpython")
+    path = os.path.join(PROJECT_ROOT, "target", "release", exe_name)
     if os.path.isfile(path) and os.access(path, os.X_OK):
         return path
     return None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/compare_bytecode.py` around lines 30 - 45, find_rustpython() fails on
Windows because it only checks for "rustpython" not "rustpython.exe" and the
debug-build guard message is misleading; change the function to compute
executable_name = "rustpython.exe" when sys.platform == "win32" else
"rustpython", use that executable_name when building debug_fragment
(os.path.join("target","debug", executable_name)) for the RUSTPYTHON debug check
and when constructing the release path
(os.path.join(PROJECT_ROOT,"target","release", executable_name)), and update the
ValueError text to reference the platform-specific
target/debug/<executable_name> so both autodiscovery and the debug-build guard
work on Windows.

Comment thread scripts/dis_dump.py
Comment on lines +178 to +189
def _extract_instructions(code):
"""Extract normalized instruction list from a code object.

- Filters out CACHE/PRECALL instructions
- Converts jump targets from byte offsets to instruction indices
- Resolves argument names via fallback when argrepr is missing
- Normalizes argument representations
"""
try:
raw = list(dis.get_instructions(code))
except Exception as e:
return [["ERROR", str(e)]]

⚠️ Potential issue | 🟠 Major

Bubble disassembly failures into status: "error" instead of synthetic bytecode.

If dis.get_instructions() fails here, the file still comes back from process_file() as "status": "ok" with [["ERROR", ...]] in its instruction stream. scripts/compare_bytecode.py compares those lists verbatim, so extractor failures get reported as bytecode diffs instead of compile/dump errors. Let _extract_instructions() raise and reuse the existing process_file() error path.

Suggested fix
 def _extract_instructions(code):
@@
     try:
         raw = list(dis.get_instructions(code))
     except Exception as e:
-        return [["ERROR", str(e)]]
+        raise RuntimeError(
+            "failed to disassemble %s" % (getattr(code, "co_qualname", None) or code.co_name)
+        ) from e

Also applies to: 313-323

🧰 Tools
🪛 Ruff (0.15.9)

[warning] 188-188: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/dis_dump.py` around lines 178 - 189, The current
_extract_instructions function swallows exceptions from dis.get_instructions()
and returns a synthetic [["ERROR", ...]] instruction list; change it to let the
exception propagate (remove the try/except or re-raise the caught exception) so
process_file's existing error-handling path can mark the file with status:
"error"; apply the same change to the second identical try/except block
referenced around lines 313-323 so all disassembly failures bubble up to
process_file instead of producing synthetic bytecode.
