An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Paper-reading notes: ViT
November 3, 2025 | 1851 words | Author: Tan Ke

A Tutorial on Bayesian Optimization

Paper-reading notes: A Tutorial on Bayesian Optimization
November 1, 2025 | 3591 words | Author: Tan Ke

CrossNet++: Cross-Scale Large-Parallax Warping for Reference-Based Super-Resolution

Paper-reading notes: CrossNet++: Cross-Scale Large-Parallax Warping for Reference-Based Super-Resolution
October 29, 2025 | 1433 words | Author: Tan Ke

xLSTM: Extended Long Short-Term Memory

Paper-reading notes: xLSTM: Extended Long Short-Term Memory
October 28, 2025 | 1394 words | Author: Tan Ke

RWKV: Reinventing RNNs for the Transformer Era

Paper-reading notes: RWKV: Reinventing RNNs for the Transformer Era
October 27, 2025 | 1499 words | Author: Tan Ke

Mastering the game of Go with MCTS and Deep Neural Networks

Paper-reading notes: Mastering the game of Go with MCTS and Deep Neural Networks
October 24, 2025 | 2246 words | Author: Tan Ke

CrossNet: An End-to-end Reference-based Super Resolution Network using Cross-scale Warping

Paper-reading notes: CrossNet: An End-to-end Reference-based Super Resolution Network using Cross-scale Warping
October 21, 2025 | 1976 words | Author: Tan Ke

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Paper-reading notes: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
October 20, 2025 | 314 words | Author: Tan Ke

Learning‑based light field imaging

Paper-reading notes: Learning‑based light field imaging
October 20, 2025 | 6550 words | Author: Tan Ke

A Survey of RAG

This post is primarily based on the survey “Retrieval-Augmented Generation for AI-Generated Content: A Survey”. It presents retrieval-augmented generation (RAG) in six parts: background, method, enhancement, applications, outlook, and takeaways. Background In recent years, we’ve seen a rapid surge in Artificial Intelligence Generated Content (AIGC), driven by large generative models that can produce text, code, images, and even videos (Zhao et al. 2024). For text and code, widely used examples include GPT-style models and Anthropic’s Claude family (Achiam et al. 2023, Anthropic 2024). For images, modern systems are often powered by diffusion-based text-to-image generation, including latent diffusion models (Ramesh et al. 2021, Rombach et al. 2022). For video, OpenAI’s Sora is a prominent example of large-scale text-to-video generation (OpenAI 2024). ...

October 19, 2025 | 8549 words | Author: Tan Ke