About the Book
This book is a collection of data wrangling problems and solutions tailored for business school students. It is designed as a cookbook: if you have a particular data wrangling problem at hand and want to get the job done quickly, you can look up the solution and use it directly. I have also tried my best to explain each line of code, so that beyond copying and pasting, you can still learn useful techniques :)
Currently, this book has two parts:
Part I: R with data.table
A Highly Efficient Event Study Code. Event studies are used almost everywhere in business research. I offer a highly efficient event study implementation in R. The core part has only 30+ lines, and it's 5x to 10x faster than the Python version offered by WRDS.
40 Practices on Stock Data Processing (Recommended). A collection of 40+ problems that you'll frequently encounter when dealing with stock data. You can use this chapter either as a cookbook or as a learning-by-doing textbook of data.table. The chapter is an English version of this GitHub repo, which is authored by me and Rui Li (Zhejiang Financial College). Special thanks to renkun-ken, who provided the original problem set.
Useful Functions. A collection of useful and efficient functions. Most of them are faster rewrites of functions offered in other packages. For example, I offer a faster function, drawdown, to compute the largest drawdowns of an asset, which is 1.89x faster than table.Drawdowns from the PerformanceAnalytics package.
Part II: Python with Polars
40 Practices on Stock Data Processing with Polars. A collection of 40+ problems that you’ll frequently encounter when dealing with stock data, implemented using Polars. This chapter serves as a Python companion to the R/data.table version, demonstrating how to leverage Polars for fast and efficient data processing.
If you have any questions, please contact me at: ross dot zhu at outlook dot com.