Bibliography
Highlights
We have highlighted a shortlist of papers and books for anyone interested in the philosophy and architecture of XTDB. We consider these works to be of high value and directly relevant to the current design of XTDB.
-
Worst-case Optimal Join Algorithms, Ngo, Porat, et al. Journal of the ACM, Vol. 64, No. 3, Article 16. 2018.
-
Skew Strikes Back: New Developments in the Theory of Join Algorithms, Ngo, Re, Rudra. ACM SIGMOD Record, Vol. 42, No. 4, Pages 5-16. 2014.
-
Temporal Database Management, Jensen. 2000.
-
Developing Time-Oriented Database Applications in SQL, Snodgrass. ISBN 1-55860-436-7. 2000.
-
Multidimensional Range Search in Dynamically Balanced Trees, Tropf, Herzog. Applied Informatics, Pages 71-77. 1981.
-
Deconstructing the Database, Hickey. InfoQ. 2013.
Bibliography
The bibliography contains papers, books, and other materials which are directly relevant to XTDB and its architecture — past, present, and very-near-future.
Columnar Stores
-
The Design and Implementation of Modern Column-Oriented Database Systems, Abadi, Boncz, et al. Foundations and Trends in Databases, Vol. 5, No. 3, Pages 197–280. 2012.
-
C-Store: A Column-oriented DBMS, Stonebraker, Abadi, et al. Proceedings of the 31st VLDB Conference. 2005.
-
Mainlining Databases: SupportingFast Transactional Workloads onUniversal Columnar Data File Format, Li, Butrovich, et al. April 2020.
-
Fast Serializable Multi-Version Concurrency Controlfor Main-Memory Database Systems, Neumann, Mühlbauer, Kemper. SIGMOD’15. 2015.
-
Efficient Transaction Processing in SAP HANA Database – The End of a Column Store Myth, Sikka, Färber, et al. SIGMOD ’12. 2012.
-
A Decomposition Storage Model, Copeland, Khoshafian. ACM SIGMOD Record, Vol. 14, No. 4, Pages 268-279. May 1985.
Columnar Stores: MonetDB / VectorWise
-
Database Architecture Optimized for the new Bottleneck: Memory Access, Boncz, Manegold, Kersten. Proceedings of the 25th VLDB Conference. 1999.
-
MonetDB/X100: Hyper-Pipelining Query Execution, Boncz, Zukowski, Nes. CIDR. 2005.
-
Database Architecture Evolution: Mammals Flourished long before Dinosaurs became Extinct, Boncz, Manegold, Kersten. Proceedings VLDB Endowment, Vol. 2, Pages 1648-1653. 2009.
-
MIL Primatives for Querying a Fragmented World, Boncz, Kersten. VLDB Journal. 1999.
-
The Story of VectorWise, Boncz. Base de Donnees Avancees (BDA). 2011.
-
MonetDB: Open-source Database Technology Beyond Textbooks, Manegold.
Columnar Stores: SingleStore
-
Investigating Linux Performance with Off-CPU Flame Graphs, Reece. January 2016.
-
SingleStore Universal Storage – And Then There Was One, Hanson. September 2019.
Storage vs. Compute
-
Choosing A Cloud DBMS: Architectures and Tradeoffs, Tan, Ghanem, et al. Proceedings of the VLDB Endowment, Vol. 12, No. 12. August 2019.
-
Building a Database on S3, Brantner, Florescu, et al. 2008.
-
Building An Elastic Query Engine on Disaggregated Storage, Vuppalapati, Miron, et al. 17th USENIX Symposium on Networked Systems Design and Implementation. 2020.
-
Building an elastic query engine on disaggregated storage: Summary, Colyer. March 2020.
-
-
Socrates: The New SQL Server in the Cloud, Antonopoulos, Budovski, et al. SIGMOD '19: Proceedings of the 2019 International Conference on Management of Data, Pages 1743–1756. June 2019.
-
Eon Mode: Bringing the Vertica Columnar Database to the Cloud, Vandiver, Prasad, et al. SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data, Pages 797–809. May 2018.
-
Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases, Verbitski, Gupta, et al. SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data, Pages 1041–1052. May 2017.
-
Hailstorm: Disaggregated Compute and Storage forDistributed LSM-based Databases, Bindschaedler, Goel, Zwaenepoel. ASPLOS '20: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, Pages 301–316. March 2020.
-
Separating compute and storage: What it means, and why it’s important for databases, Storm. Adam Storm’s Weblog. January 2019.
-
The Power of Separating Cloud Compute and Cloud Storage: Why decoupling maximizes flexibility and business value. Teradata Weblog.
-
Vertica’s Architectural Direction – Separation of Compute and Storage, Bear. Vertica Weblog. June 2019.
Bi-Temporality
-
Developing Time-Oriented Database Applications in SQL, Snodgrass, ISBN 1-55860-436-7. 2000.
-
Temporal Database Management, Jensen. 2000.
-
Chapter 2: Semantics of Time-Varying Information, Jensen, Snodgrass. 2000.
-
Chapter 4: On the Semantics of “Now” in Databases, Clifford, Dyreson, et al. 2000.
-
The Consensus Glossary of Temporal Database Concepts, Christian S. Jensen, James Clifford et al. 1994
-
-
Bi-temporal Timeline Index: A Data Structure forProcessing Queries on Bi-temporal Data, Kaufmann et al. 2015 IEEE 31st International Conference on Data Engineering, April 2015.
-
Unifying Temporal Data Models via a Conceptual Model, Jensen, Snodgrass. Information Systems, Volume 19, Issue 7, October 1994, Pages 513-547.
-
A Deep Dive into Bitemporal, Woolridge. MarkLogic Weblog. 2016.
-
Kx Insights: Powering Business Decisions with Bitemporal Data, Tomczak. Kx Weblog. 2017.
-
Sign of the Times - Managing inhomogeneously bitemporal data with Datomic and Clojure, Fraenkel. May 2014.
-
ISO/IEC JTC 1/SC 32: SQL Technical Reports — Part 2: SQL Support for Time-Related Information, ISO/IEC TR 19075-2:2015(E). 2015.
-
Modification Semantics in Now-Relative Databases, Torp, Jensen, Snodgrass. Information Systems, Vol. 29, No. 8, Pages 653–683. 2004.
-
A split operator for now-relative bitemporal databases, Agesen, Böhlen, et al. Proceedings of 17th International Conference on Data Engineering. 2001.
-
The Temporal Query Language TQuel, Snodgrass. ACM Transactions on Database Systems, Vol. 12, No. 2. 1987.
Datalog
-
What you Always Wanted to Know About Datalog (And Never Dared to Ask), Ceri, Gottlob, Tanca. IEEE Trans. Knowl. Data Eng. 1989.
-
Dedalus: Datalog in Time and Space, Alvaro, Marczak, et al. Technical Report No. UCB/EECS-2009-173. December 2009.
Query Planner
-
Worst-case Optimal Join Algorithms, Ngo, Porat, et al. Journal of the ACM, Vol. 64, No. 3, Article 16. 2018.
-
Skew Strikes Back: New Developments in the Theory of Join Algorithms, Ngo, Re, Rudra. ACM SIGMOD Record, Vol. 42, No. 4, Pages 5-16. 2014.
-
Combining Worst-Case Optimal and Traditional Binary Join Processing, Freitag et al.
-
Worst-Case Optimal Graph Joins in Almost No Space, Arroyuelo, Hogan, et al. SIGMOD '21. 2021.
-
Execution Strategies for SQL Subqueries, Elhemali, Galindo-Legaria, et al. Proceedings of the ACM SIGMOD International Conference on Management of Data. 2007.
-
Translating SQL into the Relational Algebra, Bussche, Vansummeren. 2009.
-
Unnesting Arbitrary Queries, Neumann, Kemper. BTW. 2015.
Graph Representation and Traversal
-
A Demonstration of MAGiQ: Matrix Algebra Approach for Solving RDF Graph Queries, Jamour, Abdelaziz, Kalnis. Proceedings of the VLDB Endowment. August 2018.
-
Data structures for temporal graphs based on compact sequence representations, Caro, Rodriguez, Brisaboa. Information Systems 51. 2015.
Data Structures
-
SuRF: Practical Range Query Filtering with Fast Succinct Tries, Zhang, Lim, et al. SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data, Pages 323–336. May 2018.
Data Structures: Trees
-
k2-Trees for Compact Web Graph Representation, Brisaboa, Ladra, Navarro. String Processing and Information Retrieval, 16th International Symposium. 2009.
-
k-d Tree. Wikipedia.
-
R* Tree. Wikipedia.
-
A Triangular Decomposition Access Method for Temporal Data - TD-tree, Stantic, Topor, et al. ADC '11: Proceedings of the Twenty-Second Australasian Database Conference, Vol. 115, Pages 113–122. 2011.
-
Fast Parallel Hypertree Decompositions in Logarithmic Recursion Depth, Gottlob, Lanzinger, et al. 2021.
-
Reference Implementation, GitHub.
-
-
Hypersuccinct Trees — New universal tree source codes for optimal compressed tree data structures, Munro, Nicholson, et al. 2021.
-
Learning Multi-dimensional Indexes, Nathan, Ding, et al. Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, Pages 985-1000. 2020.
-
The n-dimensional k-vector and its application to orthogonal range searching, Arnas, Leake, Mortari. Applied Mathematics and Computation, Vol. 372. 2020.
Data Structures: Raytracing
-
BIH (Bounding Interval Hierarchy), Gründl.
-
Instant Ray Tracing: The Bounding Interval Hierarchy, Wächter, Keller. Eurographics Symposium on Rendering. 2006.
-
Naive Ray-Tracing: A Divide-And-Conquer Approach, Mora. ACM Trans. Graph. 30, 5, Article 117. October 2011.
Data Structures: Closest Point
-
DIVIDE-AND-CONQUER IN MULTIDIMENSIONAL SPACE, Bentley, Shamos. Proceedings of the Eighth Annual ACM Symposium on Theory of Computing. 1976.
-
Fast Hierarchical Clustering and Other Applications ofDynamic Closest Pairs, Eppstein. ACM Journal of Experimental Algorithmics, Vol. 5. 2000.
-
Dynamic Generalized Closest Pair: Revisiting Eppstein’s Technique, Chan. Symposium on Simplicity in Algorithms. 2020.
Data Structures: Temporal
-
Temporal Database Management, Jensen. 2000.
-
Chapter 36: R-Tree Based Indexing of Now-Relative Bitemporal Data, Bliuj ̄ut ̇e, Jensen, et al. 2000.
-
Chapter 37: Light-Weight Indexing of General Bitemporal Data, Bliuj ̄ut ̇e, Jensen, et al. 2000.
-
-
The POINT Approach to Represent now in Bitemporal Databases, Stantic, Sattar, et al. Journal of Intelligent Information Systems, Vol. 32, Pages 297–323. 2009.
-
Designing Access Methods for Bitemporal Databases, Kumar, Tsotras, Faloutsos. IEEE Transactions on Knowledge and Data Engineering, Vol. 10, No. 1, Pages 1-20. 1998.
-
Indexing Bi-temporal Windows, Ge, Kaufmann, et al. Proceedings of the 27th International Conference on Scientific and Statistical Database Management, No. 19. 2015.
-
Querying now-relative data, Anselma, Luca, et al. Journal of Intelligent Information Systems, No. 41, Pages 285–311. 2013.
-
Comparison of Access Methods for Time-Evolving Data, Salzberg, Tsotras. ACM Computing Surveys, Vol. 31, No. 2. June 1999.
Data Structures: Z-Curves
-
Multidimensional Range Search in Dynamically Balanced Trees, Tropf, Herzog. Applied Informatics, Pages 71-77. 1981.
Relational Algebra
-
The Third Manifesto, Darwen, Date.
-
Databases, Types, and The Relational Model: The Third Manifesto, Date, Darwen. 3rd edition, Addison-Wesley, 2006 (ISBN: 0-321-39942-0).
-
An Overview and Analysis of Proposals Based on the TSQL2 Approach, Date, Darwen. 2005.
Array Programming
-
Notation as a Tool of Thought, Iverson. Communications of the ACM, Vol. 23, No. 8, Pages 444–465. August 1980.
-
Learning J: An Introduction to the J Programming Language, Stokes. 2015.
Inspiration
These resources do not necessarily reflect algorithms, datastructures, or concepts which apply directly to the current XTDB architecture. These resources have been useful for the XTDB team in the past, for one reason or another. They may: reflect past XTDB architectures, have helped onboard team members, influence our overall philosophy, or simply be something we find interesting in the field.
Data Structures
-
HINT: A Hierarchical Index for Intervals in Main Memory, Christodoulou, Bouros, Mamoulis. April 2021.
-
Mathematics of Digital Hyperspace, Kepner, Davis, et al. March 2021.
-
Data-Oriented Design and C++, Acton. CppCon (YouTube). 2014.
Design Philosophy
-
Designing Data-Intensive Applications, Kleppmann. O’Reilly, ISBN: 9781449373320. 2017.
-
Turning the Database Inside-Out, Kleppmann. StrangeLoop. 2014.
-
The Database as a Value, Hickey. InfoQ. 2012.
-
Deconstructing the Database, Hickey. InfoQ. 2013.
Miscellaneous
-
The End of an Architectural Era (It’s Time for a Complete Rewrite), Stonebraker, Hachem, et al. VLDB '07: Proceedings of the 33rd international conference on Very large data bases, Pages 1150–1160. 2007.
-
Push vs. Pull-Based Loop Fusion in Query Engines, Shaikhha, Dashti, Koch. Journal of Functional Programming, Vol. 28. 2018.
-
Is Kafka a Database?, Kleppmann. Kafka Summit London (YouTube). 2019.
JUXT Resources
Our own talks and articles are listed in reverse-chronological order. Although older resources are still relevant to the philosophy and design of XTDB, newer resources will always provide a better undersatnding of implementation details.
Talks and Articles
-
The Strength of the Record, Deobald. xtdb.com. 2021.
-
XTDB SQL: Query your Datalog database with SQL, Pither. XTDB Weblog. 2020.
-
The Value of Bitemporality, Pither. JUXT Weblog. 2019.
-
The Design and Implementation of a Bitemporal DBMS, Råberg. ClojuTRE (YouTube). September 2019.
-
Temporal Databases for Stream Architectures, Taylor, Pither. StrangeLoop (YouTube). September 2019.
-
The XTDB of Bitemporality, Pither. Clojure/north (YouTube). May 2019.
-
defn Podcast #49: XTDB with Jon and Jeremy, Pither, Taylor. Soundcloud. May 2019.
Prior Art
-
datascript-mapdb: Durable Datascript backed by MapDB, Råberg. GitHub (archived). 2016.
-
Eyvind: distributed rule engine, Råberg. GitHub (archived). 2016.
-
crux-datascript: Replicate XTDB data into DataScript, Taylor. Github.