Database Research Cluster

University of Salzburg

Salzburg

Open for Collaboration

Short Description

The server infrastructure comprises a network of 18 Supermicro server systems with multi-core processors based on the x86-64 architecture, used for computations and experimental setups in database and algorithm research. The nodes are connected via 40 Gbit/s Ethernet, 2x 56 Gbit/s InfiniBand (10 nodes), and 3x 200 Gbit/s InfiniBand (3 nodes), and can be used specifically for hardware-related programming and development with RDMA.

Systems with 96 GB to 2 TB of RAM and 12 to 64 CPU cores are available. Research data is stored on fast storage systems in RAID arrays with multiple redundancy, with backups at different locations. A continuous integration (CI) system supports the rapid development of research software.

The system is designed to meet different requirement profiles. It supports memory-intensive and compute-intensive applications as well as distributed computations across multiple nodes. In addition, applications and operating systems can be run in isolation and abstracted from the hardware using virtualization and containers.

Research computers based on the x86-64 architecture and equipped with various operating systems (Windows, Linux, macOS) are available for accessing the server infrastructure.

Contact Person

Prof. DI Dr. Nikolaus Augsten

Research Services

Systems and algorithms for processing data
Storing and querying large data
Approximate query processing
Spatio-temporal database systems
GIS enabled databases
Interconnected nodes with 2x 56 Gbit/s InfiniBand (per computing node) or 2x 200 Gbit/s InfiniBand, and 40 Gbit/s Ethernet
Direct InfiniBand hardware access for RDMA-enabled software and services
Abstraction and isolation of running systems using virtualization and container technologies
Database-as-a-service (DH-Infra Project)
Test and Development Infrastructure-as-a-service (DH-Infra Project)

Methods & Expertise for Research Infrastructure

The overall infrastructure is divided into two parts: a research infrastructure and a service infrastructure.
The research infrastructure is designed for research in the field of data engineering with the aim of solving efficiency problems in data processing. In this branch of research, new algorithms are developed, implemented, and empirically evaluated. The empirical evaluation requires precise runtime measurements, measurements of memory consumption, as well as network traffic. This usually requires exclusive and physical access (bare metal) to individual servers or a cluster of servers. The usage is characterized by frequently changing configurations to be able to run tests under different conditions.
The service infrastructure focuses on providing virtualized environments for database-as-a-service and infrastructure-as-a-service offerings for the digital humanities in Salzburg and Austria. Its design ensures that we can tailor the hardware and software settings to the needs of the projects we serve while still providing a high level of security.

The research infrastructure is professionally administered and offers supporting services for running experiments, e.g. versioned storage of experimental setups and large experimental data, end-to-end encryption (E2EE) for sensitive research data, fully automatic provisioning of cluster nodes, and databases for measurement results. The technical staff support and advise researchers in setting up their experiments. On the scientific side, the group has extensive experience in the design and empirical evaluation of single-core, multi-core, parallel shared-nothing, and distributed algorithms.

Terms of Use

Please contact us via science.plus@plus.ac.at, or contact the person named under Contact Person.

Cooperation Partners

Humboldt-University Berlin
Johannes Gutenberg University Mainz (JGU)
Technical University Munich
Celonis SE, Munich
Findologic GmbH, Salzburg
Salzburg Research GmbH

Reference Projects

DH-Infra
2023 - 2026
Assoz.-Prof. Martin Schäler
BMBF

DESQ - Declarative and Efficient Similarity Queries
2022 - 2026
Univ. Prof. Dipl.-Ing. Nikolaus Augsten, PhD
Fonds zur Förderung der wissenschaftlichen Forschung: FWF
https://dbresearch.uni-salzburg.at/projects/desq/index.php

BOSS 1.0: Biblical Online Synopsis Salzburg 1.0
2021 - 2024
Univ. Prof. Nikolaus Augsten, Assoz.-Prof. Martin Schäler, Univ. Prof. Kristin De Troyer
Land Salzburg
https://dbresearch.uni-salzburg.at/projects/

Fast and Flexible Tree Edit Distance (FFTED) Project
2017-2021
Univ. Prof. Dipl.-Ing. Nikolaus Augsten, PhD
Fonds zur Förderung der wissenschaftlichen Forschung: FWF
https://ffted.dbresearch.uni-salzburg.at/

FWF Doctoral College GIScience
2015-2019
Nikolaus Augsten, Euro Beinat, Stefan Lang, Franz Neubauer, Anette Bartsch, Thomas Blaschke, Michael Leitner, Josef Strobl
Fonds zur Förderung der wissenschaftlichen Forschung: FWF
https://dk-giscience.zgis.net/

Synonyme für Suchmaschinen
2018-2019
Univ. Prof. Dipl.-Ing. Nikolaus Augsten, PhD
Findologic GmbH, Österreichische Forschungsförderungsgesellschaft mbH

Reference Publications

Manuel Widmoser, Daniel Kocher, Nikolaus Augsten: Scalable Distributed Inverted List Indexes in Disaggregated Memory
Proc. ACM Manag. Data (2024)

Daniel Ulrich Schmitt, Daniel Kocher, Nikolaus Augsten, Willi Mann, Alexander Miller: A Two-Level Signature Scheme for Stable Set Similarity Joins. Proc. VLDB Endow. 16(11): 2686-2698 (2023)
https://doi.org/10.14778/3611479.3611480

Oksana Dolmatova (Univ. Zurich), Nikolaus Augsten (Univ. Salzburg), Michael H. Böhlen (Univ. Zurich):
A Relational Matrix Algebra and its Implementation in a Column Store. SIGMOD Conference 2020: 2573-2587
https://doi.org/10.1145/3318464.3389747

Thomas Hütter (Univ. Salzburg), Maximilian H. Ganser (Univ. Salzburg), Manuel Kocher (Univ. Salzburg), Merima Halkic (Univ. Salzburg), Sabine Agatha (Univ. Salzburg), Nikolaus Augsten (Univ. Salzburg):
DeSignate: detecting signature characters in gene sequence alignments for taxon diagnoses. BMC Bioinform. 21(1): 151 (2020)
https://doi.org/10.1186/s12859-020-3498-6

Set Similarity Joins on MapReduce: An Experimental Survey
2018
Fabian Fier (Humboldt-Universität zu Berlin), Nikolaus Augsten (Univ. Salzburg), Panagiotis Bouros (Johannes Gutenberg University Mainz), Ulf Leser (Humboldt-Universität zu Berlin), Johann-Christoph Freytag (Humboldt-Universität zu Berlin) PVLDB 11(10): 1110-1122
https://doi.org/10.14778/3231751.3231760

An Empirical Evaluation of Set Similarity Join Techniques
2016
Willi Mann (Univ. Salzburg), Nikolaus Augsten (Univ. Salzburg), Panagiotis Bouros (Aarhus Univ., Denmark) PVLDB 9(9): 636-647
https://doi.org/10.14778/2947618.2947620

On-the-fly token similarity joins in relational databases
2014
Nikolaus Augsten (Univ. Salzburg), Armando Miraglia (VU Amsterdam), Thomas Neumann (TU München), Alfons Kemper (TU München) SIGMOD Conference 2014: 1495-1506
https://doi.org/10.1145/2588555.2610530

JEDI: These aren't the JSON documents you're looking for?
2022
Thomas Hütter, Nikolaus Augsten, Christoph M. Kirsch, Michael J. Carey, Chen Li
SIGMOD Conference 2022: 1584-1597
https://doi.org/10.1145/3514221.3517850

Scaling Density-Based Clustering to Large Collections of Sets
2021
Daniel Kocher, Nikolaus Augsten, Willi Mann
EDBT Conference 2021: 109-120
https://doi.org/10.5441/002/edbt.2021.11

Swellfish privacy: Supporting time-dependent relevance for continuous differential privacy
2022
Christine Tex, Martin Schäler, Klemens Böhm
Inf. Syst. 109
https://doi.org/10.1016/j.is.2022.102079

Tree edit distance: Robust and memory-efficient.
2016
Mateusz Pawlik, Nikolaus Augsten
Inf. Syst. 56: 157-173
https://doi.org/10.1016/j.is.2015.08.004
