Projects

Here is a brief overview of selected public projects. Follow my current activities on LinkedIn.

Open source projects

R packages

I maintain or have maintained the following R packages:

parallelDist | R, C++11

The parallelDist package provides a fast parallelized alternative to R’s native ‘dist’ function to calculate distance matrices for continuous, binary, and multi-dimensional input matrices and offers a broad variety of distance functions from the ‘stats’, ‘proxy’ and ‘dtw’ R packages. For ease of use, the ‘parDist’ function extends the signature of the ‘dist’ function and uses the same parameter naming conventions as distance methods of existing R packages. Currently 39 different distance methods are supported.

Git | CRAN

ibmdbR | R, SQL

Provides the functionality required to efficiently use R with IBM(R) Db2(R) Warehouse offerings (formerly IBM dashDB(R)) and IBM Db2 for z/OS(R) in conjunction with IBM Db2 Analytics Accelerator for z/OS. Many basic and complex R operations are pushed down into the database, which removes the main memory boundary of R and allows full use of parallel processing in the underlying database.

CRAN

Past projects

IBM Cloud SQL Query | Node.js, RabbitMQ, Spark, K8s, CI/CD, JKG

Backend development of highly available microservices, powering the serverless and cloud-native IBM Cloud SQL Query service. SQL Query allows querying structured data (JSON, CSV, Parquet, etc.) stored on Cloud Object Storage (COS).

See LinkedIn

IBM Db2/dashDB Warehouse Spark integration | Java 7, Docker, Spark, REST

Implemented a distributed, multi-tenant Apache Spark cluster manager that spawns resource-restricted Spark clusters for individual users.

See LinkedIn

Data Mining Cup 2014 | R, SPSS Modeler, Python

Second place (best German team) among more than 125 teams in the international DMC student contest. Objective: minimizing the prediction error of a classification problem.

DMC page

SAP HANA: In-database KORD association rule mining | C++11

While working as a student at SAP, I implemented a version of the "K-optimal rule discovery" association rule mining algorithm, which is now part of the Predictive Analysis Library (PAL) of SAP HANA.

Documentation

Data Mining Cup 2013 | R, SPSS Modeler, Python

Third place among more than 80 teams in the international DMC student contest. Objective: minimizing the prediction error of a classification problem.

DMC page

Papers

  • Edouard Fouché, Alexander Eckert, and Klemens Böhm. In-database analytics with ibmdbpy. In Proceedings of the 30th International Conference on Scientific and Statistical Database Management, SSDBM '18, pages 31:1–31:4, New York, NY, USA, 2018. ACM. [DOI | http]
  • Stefanie Betz, Erik Burger, Alexander Eckert, Andreas Oberweis, Ralf Reussner, and Ralf Trunko. An approach for integrated lifecycle management for business processes and business software. In Rami Bahsoon, Ivan Mistrík, Antony Tang, and Judith A. Stafford, editors, Aligning Enterprise, System, and Software Architectures. IGI Global, 2012.

Honorable mentions

Battlefield Stats Viewer | Delphi

Multilingual statistics GUI app for the game "Battlefield 2", developed in my free time around 2006. More than 150,000 downloads from my project page and a feature on gamespy.com.