Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.
About me
This is a page not in the main menu.
Published:
The wavelet-based variance transformation method is used for system modelling and prediction. It refines the spectral representation of predictors using wavelet theory, which leads to improved model specification and prediction accuracy. The supporting open-source software, Wavelet System Prediction (WASP), can be found on the Software page.
Published:
GitHub Pages is a static site hosting service that takes HTML, CSS, and JavaScript files straight from a repository on GitHub, optionally runs the files through a build process, and publishes a website.
Published:
A word cloud is a collection or cluster of words displayed in different sizes. The larger a word appears, the more often it occurs in the given text, and the more important it is taken to be.
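The size rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular word-cloud library: the function names `word_frequencies` and `font_sizes`, the linear scaling, and the 10 to 48 point size range are all assumptions made for the example.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count case-insensitive word occurrences in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def font_sizes(freqs, min_size=10, max_size=48):
    """Map each word count linearly onto a font-size range.

    The size range is a hypothetical choice for illustration.
    """
    if not freqs:
        return {}
    lowest, highest = min(freqs.values()), max(freqs.values())
    span = highest - lowest
    sizes = {}
    for word, count in freqs.items():
        if span == 0:
            # All words occur equally often, so give them the same size.
            sizes[word] = max_size
        else:
            sizes[word] = min_size + (count - lowest) * (max_size - min_size) / span
    return sizes


if __name__ == "__main__":
    freqs = word_frequencies("data data data model model agent")
    for word, size in font_sizes(freqs).items():
        print(f"{word}: count={freqs[word]}, size={size:.0f}")
```

The most frequent word gets the largest size and the least frequent the smallest. Real word-cloud tools usually also drop stop words and handle layout, which this sketch leaves out.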
Published:
LaTeX is a document preparation system for high-quality typesetting. It is most often used for medium-to-large technical or scientific documents, but it can be used for almost any form of publishing.
Ph.D. student at Renmin University of China 
Published:
A cost-effective in-context learning framework for entity resolution.
Published:
A multi-agent data preparation framework with planner and programmer agents.
Published:
An LLM-powered data agent system for autonomous data preparation, featuring tree-based reasoning and post-training.
Published in Wireless Communications and Mobile Computing, 2021
This paper is about Question Answering over Knowledge Base.
Recommended citation: Fan, Meihao, Lei Zhang, Siyao Xiao, and Yuru Liang. "Few-shot multi-hop question answering over knowledge base." arXiv preprint arXiv:2112.11909 (2021). https://onlinelibrary.wiley.com/doi/abs/10.1155/2022/8045535
Published in IEEE 40th International Conference on Data Engineering (ICDE), 2024
This paper is about LLM for Entity Resolution.
Recommended citation: Fan, Meihao, Xiaoyue Han, Ju Fan, Chengliang Chai, Nan Tang, Guoliang Li, and Xiaoyong Du. "Cost-effective in-context learning for entity resolution: A design space exploration." In 2024 IEEE 40th International Conference on Data Engineering (ICDE), pp. 3696-3709. IEEE, 2024. https://ieeexplore.ieee.org/abstract/document/10597751
Published in VLDB 2025 (Accepted), 2025
This paper proposes AutoPrep, a multi-agent framework for data preparation.
Recommended citation: Meihao Fan, Ju Fan, Nan Tang, Lei Cao, Guoliang Li, Xiaoyong Du. "AutoPrep: Natural Language Question-Aware Data Preparation with a Multi-Agent Framework." VLDB 2025 (Accepted). https://arxiv.org/abs/2412.10422
Published in SIGMOD 2026 (Accepted), 2026
This paper proposes Reward-SQL, a framework for improving Text-to-SQL reasoning using process-supervised reward models.
Recommended citation: Yuxin Zhang, Meihao Fan, Ju Fan, Mingyang Yi, Yuyu Luo, Jian Tan, Guoliang Li. "Reward-SQL: Boosting Text-to-SQL via Stepwise Execution-Aware Reasoning and Process-Supervised Rewards." SIGMOD 2026 (Accepted).
Published in VLDB 2026 (Under Review), 2026
This paper proposes DeepPrep, an LLM-powered agentic system for autonomous data preparation.
Recommended citation: Meihao Fan, Ju Fan, Yuxin Zhang, Shaolei Zhang, Xiaoyong Du, Jie Song, Peng Li, Fuxin Jiang, Tieying Zhang, Jianjun Chen. "DeepPrep: An LLM-Powered Agentic System for Autonomous Data Preparation." VLDB 2026 (Under Review).
Published in VLDB 2026 (Accepted), 2026
This paper proposes TACO, a benchmark for Open-Domain Text-to-SQL.
Recommended citation: Chao Deng, Ju Fan, Yuyu Luo, Qinliang Xue, Meihao Fan, Yuxin Zhang, Min Zhang, Xiaofeng Jia, Jing Zhang, Xiaoyong Du. "TACO: A Benchmark for Open-Domain Text-to-SQL with Ambiguous and Cross-Database Queries." VLDB 2026 (Accepted).
Published in ICML 2026 (Under Review), 2026
This paper explores agentic large language models for autonomous data science.
Recommended citation: Anonymous Authors (incl. Meihao Fan). "DeepAnalyze: Agentic Large Language Models for Autonomous Data Science." ICML 2026 (Under Review).
Published in ICML 2026 (Under Review), 2026
This paper introduces CODA-BENCH to evaluate code agents on data-intensive tasks.
Recommended citation: Anonymous Authors (incl. Meihao Fan). "CODA-BENCH: Can Code Agents Handle Data-Intensive Tasks?" ICML 2026 (Under Review).
The open-source R package NPRED is used to identify the meaningful predictors of a response from a large set of potential predictors.
The open-source software WASP is used for system modeling and prediction.
The open-source software WQM is used for post-processing numerical weather predictions.
Generate synthetic time series from commonly used statistical models, including linear, nonlinear and chaotic systems.