Yuandong Tian

38
Papers
406
Total Citations

Papers (38)

ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games

NeurIPS 2017 (arXiv)
131
citations

AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs

ICML 2025
123
citations

NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions

NeurIPS 2025
46
citations

JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention

ICLR 2024
46
citations

Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning

ICML 2025
45
citations

From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications

ICML 2025
15
citations

LoCoCo: Dropping In Convolutions for Long Context Compression

ICML 2024
0
citations

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

ICML 2024
0
citations

GenCO: Generating Diverse Designs with Combinatorial Constraints

ICML 2024
0
citations

TravelPlanner: A Benchmark for Real-World Planning with Language Agents

ICML 2024
0
citations

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

ICML 2024
0
citations

Contrastive Predict-and-Search for Mixed Integer Linear Programs

ICML 2024
0
citations

Semantic Amodal Segmentation

CVPR 2017 (arXiv)
0
citations

FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search

CVPR 2019
0
citations

FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions

CVPR 2020 (arXiv)
0
citations

FP-NAS: Fast Probabilistic Neural Architecture Search

CVPR 2021
0
citations

FBNetV3: Joint Architecture-Recipe Search Using Predictor Pretraining

CVPR 2021 (arXiv)
0
citations

On the Importance of Asymmetry for Siamese Representation Learning

CVPR 2022 (arXiv)
0
citations

Bayesian Relational Memory for Semantic Visual Navigation

ICCV 2019
0
citations

Param$\Delta$ for Direct Mixing: Post-Train Large Language Model At Zero Cost

ICLR 2025
0
citations

Coda: An End-to-End Neural Program Decompiler

NeurIPS 2019
0
citations

Learning to Perform Local Rewriting for Combinatorial Optimization

NeurIPS 2019
0
citations

Hierarchical Decision Making by Generating and Following Natural Language Instructions

NeurIPS 2019
0
citations

One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers

NeurIPS 2019
0
citations

Learning Search Space Partition for Black-box Optimization using Monte Carlo Tree Search

NeurIPS 2020
0
citations

Joint Policy Search for Multi-agent Collaboration with Imperfect Information

NeurIPS 2020
0
citations

Learning Space Partitions for Path Planning

NeurIPS 2021
0
citations

MADE: Exploration via Maximizing Deviation from Explored Regions

NeurIPS 2021
0
citations

Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages

NeurIPS 2021
0
citations

NovelD: A Simple yet Effective Exploration Criterion

NeurIPS 2021
0
citations

DreamShard: Generalizable Embedding Table Placement for Recommender Systems

NeurIPS 2022
0
citations

Understanding Deep Contrastive Learning via Coordinate-wise Optimization

NeurIPS 2022
0
citations

Landscape Surrogate: Learning Decision Losses for Mathematical Optimization Under Partial Information

NeurIPS 2023
0
citations

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

NeurIPS 2023
0
citations

Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer

NeurIPS 2023
0
citations

An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis

ICML 2017
0
citations

Gradient Descent Learns One-hidden-layer CNN: Don’t be Afraid of Spurious Local Minima

ICML 2018
0
citations

ELF OpenGo: an analysis and open reimplementation of AlphaZero

ICML 2019
0
citations