France Labs Datafari Enterprise Search

How Enterprise Search Can Help You with GDPR Compliance

Guest post by France Labs

Datafari, as an Enterprise Search solution, has visibility over all of an organization's knowledge bases. As such, it is a good entry point for checking where PII (Personally Identifiable Information) is stored.

Indeed, as part of the GDPR requirements, any organization must maintain a list of where PII is stored. But once the knowledge base grows beyond a certain size, it becomes impossible to maintain such a list manually. Distributing this task across the different departments of the organization is a good start, but it has its limits, for instance because colleagues may misinterpret what counts as PII.

This is where Enterprise Search solutions come in handy: because they already crawl all of the internal documents and data, it is simple to add detection mechanisms that automatically generate a list of documents likely to contain PII.

Such a feature is available for free with the open source version of Datafari, Datafari Community Edition. At the Open Source Experience event in Paris at the end of 2023, we presented a demo and a walkthrough on how to set it up. Thanks to this tutorial, you can build an end-to-end system that detects patterns via regular expressions (think phone numbers, social security numbers, etc.) as well as entities via Machine Learning (people names and organizations, for instance), using a dedicated spaCy server that leverages Transformers models. You can now do it yourself by following this link, which details the necessary steps: using Datafari for GDPR PII inventory.
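To give a feel for the regular expression part of such a pipeline, here is a minimal Python sketch. It is not Datafari's implementation: the patterns, the `detect_pii` and `build_inventory` functions, and the sample documents are all hypothetical and deliberately simplified, and the Machine Learning entity detection is left out. It only illustrates how scanning document text can produce a list of candidate PII holders.

```python
import re
from typing import Dict, List

# Hypothetical, simplified patterns (assumptions for illustration only).
# Real-world patterns need tuning per country and per document format.
PII_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "french_phone": re.compile(r"(?<!\d)(?:\+33[ .]?|0)[1-9](?:[ .]?\d{2}){4}(?!\d)"),
    "us_ssn": re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)"),
}


def detect_pii(text: str) -> Dict[str, int]:
    """Count the matches of each PII pattern in a piece of text."""
    counts: Dict[str, int] = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            counts[label] = len(matches)
    return counts


def build_inventory(documents: Dict[str, str]) -> List[dict]:
    """List the documents that contain at least one PII candidate."""
    inventory = []
    for doc_id, text in documents.items():
        hits = detect_pii(text)
        if hits:
            inventory.append({"document": doc_id, "pii_types": hits})
    return inventory


if __name__ == "__main__":
    # Made-up sample documents, keyed by a hypothetical path.
    sample_docs = {
        "hr/contract.txt": "Contact Jane at jane.doe@example.com or 01 23 45 67 89.",
        "wiki/readme.txt": "Release notes for version 2.4.",
        "finance/payroll.txt": "SSN 123-45-6789 on file.",
    }
    for entry in build_inventory(sample_docs):
        print(entry["document"], entry["pii_types"])
```

The result is a list of candidates, not a verdict: regular expressions produce false positives, so each flagged document still needs a human review before it goes into the GDPR inventory.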

Do you have questions about Datafari or need support?

Cedric Ulmer

Cedric is CEO and co-founder of France Labs, a startup specializing in open source search engines and the maker of Datafari, an open source enterprise search solution. He runs the company and is in charge of innovation and marketing. On the ecosystem side, he leads the open source business community within the largest association of IT companies on the Côte d'Azur. He has been teaching entrepreneurship for 4 years in the Data Science European MSc of the EIT. Before that, he spent ten years in the research department at SAP. Cedric holds the French Grande Ecole diploma from Telecom SudParis and the Eurecom certificate.