Here is a short overview of selected public projects. Follow my current activities on LinkedIn.
Open source projects
I maintain or have maintained the following R packages:
parallelDist | R, C++11
The parallelDist package provides a fast parallelized alternative to R’s native ‘dist’ function to calculate distance matrices for continuous, binary, and multi-dimensional input matrices and offers a broad variety of distance functions from the ‘stats’, ‘proxy’ and ‘dtw’ R packages. For ease of use, the ‘parDist’ function extends the signature of the ‘dist’ function and uses the same parameter naming conventions as distance methods of existing R packages. Currently 39 different distance methods are supported.
ibmdbR | R, SQL
Provides the functionality required to use R efficiently with IBM(R) Db2(R) Warehouse offerings (formerly IBM dashDB(R)) and IBM Db2 for z/OS(R) in conjunction with IBM Db2 Analytics Accelerator for z/OS. Many basic and complex R operations are pushed down into the database, which removes R's main-memory limit and makes full use of parallel processing in the underlying database.
IBM Cloud SQL Query | Node.js, RabbitMQ, Spark, K8s, CI/CD, JKG
Backend development of highly available microservices that power the serverless, cloud-native IBM Cloud SQL Query service. SQL Query lets users query structured data (JSON, CSV, Parquet, etc.) stored on Cloud Object Storage (COS).
IBM Db2/dashDB Warehouse Spark integration | Java 7, Docker, Spark, REST
Implemented a distributed multi-tenant Apache Spark cluster manager that spawns resource-restricted Spark clusters for individual users.
Data Mining Cup 2014 | R, SPSS Modeler, Python
Second place (best German team) out of more than 125 teams in the international DMC student contest. Objective: minimizing the prediction error of a classification problem.
SAP Hana: In-database KORD association rule mining | C++11
While working as a student at SAP, I implemented a version of the "K-optimal rule discovery" association rule mining algorithm, which is now part of the Predictive Analysis Library (PAL) of SAP HANA.
Data Mining Cup 2013 | R, SPSS Modeler, Python
Third place out of more than 80 teams in the international DMC student contest. Objective: minimizing the prediction error of a classification problem.
- Edouard Fouché, Alexander Eckert, and Klemens Böhm. In-database analytics with ibmdbpy. In Proceedings of the 30th International Conference on Scientific and Statistical Database Management, SSDBM '18, pages 31:1-31:4, New York, NY, USA, 2018. ACM.
- Stefanie Betz, Erik Burger, Alexander Eckert, Andreas Oberweis, Ralf Reussner, and Ralf Trunko. An approach for integrated lifecycle management for business processes and business software. In Ivan Mistrík, Antony Tang, Rami Bahsoon, and Judith A. Stafford, editors, Aligning Enterprise, System, and Software Architectures. IGI Global, 2012.
Battlefield Stats Viewer | Delphi
Multilingual statistics GUI app for the game "Battlefield 2", developed in my free time around 2006. More than 150,000 downloads from my project page, and featured on gamespy.com.