Saumitra Mishra

Papers: 6

Total Citations: 13

Papers (6)

Interpreting Language Reward Models via Contrastive Explanations

Quantifying Prediction Consistency Under Fine-tuning Multiplicity in Tabular LLMs

Representation Consistency for Accurate and Coherent LLM Answer Aggregation

NeurIPS 2025, arXiv

To Steer or Not to Steer? Mechanistic Error Reduction with Abstention for Language Models

Progressive Inference: Explaining Decoder-Only Sequence Classification Models Using Intermediate Predictions

Counterfactual Metarules for Local and Global Recourse