The Open Call 3 NGI Searchers are announced


Jan 18 2024

Date: Thursday 18 January 2024
Time: 10:00-12:00
Place: KickOff Web Conference

The following NGI Search project beneficiaries have been introduced. To listen to a podcast, please use the reader below: 

OKLLM

The increasing availability of LLMs is opening up a new era in online content discovery. However, LLMs are computationally intensive. Worse still, the elements generated by the AI may contain biases or false information.
As part of this research, the project aims to fill these gaps through knowledge distillation, knowledge graph generation and verification, bias detection and transfer learning.
Listen to the podcast presentation:


AALLM

This project aims to build an evidence collection pipeline to scrutinise Large Language Model (LLM)-powered search engines, which are transforming the internet search and discovery landscape. Proposed for NGI Search's 2nd open call, the pipeline has been prototyped in the meantime, combining a prompt forgery, an experiment scheduler, browser automation and scraping modules, and an interface to explore and annotate the evidence.
This proposal aims to bring the tool to a stable, feature-complete state so that it can be open-sourced, enabling more researchers to conduct independent audits of BingChat, Google Bard, and ChatGPT-browsing mode. The tool is already being leveraged to scrutinise the current elections in Germany and Switzerland. Initial results are compelling and include a state of the art and an auditing methodology for LLM-based search engines, plus some labelling guidelines (here applied to Microsoft Copilot/TikTok/YouTube in the context of EU elections as an example) to flag (dis)information.
Listen to the podcast presentation:


SCION Browser

SCION is a path-aware inter-domain network architecture that provides applications and users opportunities to optimise data transport over the Internet. This project aims to integrate SCION into the Brave web browser to enable path-aware retrieval of web resources.
However, finding the most suitable paths is a challenging problem. The browser will use PANAPI to find suitable paths automatically, optimising application- and user-based metrics such as overall page load time, latency, bandwidth, privacy, and CO2 footprint according to the application's needs and the user's preferences set in the browser. Additionally, the project will integrate support for RHINE into the Brave browser.
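As a rough illustration of preference-driven path selection, the sketch below scores candidate paths with user-supplied weights and picks the cheapest one. It does not use PANAPI or any SCION API: the `Path` type, the `pick_path` function, the metric names, and all numbers are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    # Hypothetical path description; real SCION paths carry far more metadata.
    name: str
    latency_ms: float
    bandwidth_mbps: float
    co2_g_per_gb: float


def pick_path(paths, weights):
    """Return the path with the lowest weighted cost.

    Lower latency and CO2 reduce the cost; higher bandwidth reduces it too,
    so bandwidth enters with a negative sign. Missing weights count as 0.
    """
    def cost(p):
        return (weights.get("latency", 0.0) * p.latency_ms
                - weights.get("bandwidth", 0.0) * p.bandwidth_mbps
                + weights.get("co2", 0.0) * p.co2_g_per_gb)

    return min(paths, key=cost)


# Made-up candidate paths to the same destination.
paths = [
    Path("via-A", latency_ms=20, bandwidth_mbps=100, co2_g_per_gb=30),
    Path("via-B", latency_ms=45, bandwidth_mbps=400, co2_g_per_gb=10),
]

fast = pick_path(paths, {"latency": 1.0})   # a user who only cares about latency
green = pick_path(paths, {"co2": 1.0})      # a user who only cares about CO2
print(fast.name, green.name)
```

With these made-up values, the latency-only preference selects `via-A` and the CO2-only preference selects `via-B`; a real implementation would also have to normalise metrics with different units before combining them.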
Listen to the podcast presentation:


MetaVision

The Metaverse provides a new avenue for artists to be searchable and to showcase and sell their artwork to a global audience with minimal barriers, and for consumers to have an unbounded experience not limited by physical space or entry fees.
Yet a twofold problem remains: creating a (unique) 3D art gallery is itself a barrier for artists who want to showcase in a Metaverse setting, and concerns about trust in digital content sharing make artists hesitant to post artwork online.
This project will therefore provide Free and Open-Source Software (FOSS) for artists to create Metaverse-ready art galleries procedurally. It will also investigate the integration of non-fungible tokens (NFTs) to ensure that privacy and trust are upheld within the emerging 3D web for art.
Listen to the podcast presentation:


Trust4AI

While generative AI products based on Large Language Models (LLMs) offer substantial benefits in many respects, they also raise key concerns regarding safety and trust. Initiatives to regulate the use of AI, most notably the European Union (EU) AI Act, are expected to become crucial, requiring companies that commercialise AI products within the EU to ensure a certain level of trustworthiness.
However, checking an AI system's compliance with existing regulations is purely manual, making the process tedious, time-consuming, and unreliable.
This project aims to explore the potential of metamorphic testing in automating this task, making the process more reliable, affordable, and systematic.
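To give a sense of what metamorphic testing means here, the sketch below checks one simple metamorphic relation: changing an attribute that should be irrelevant (an age group in the prompt) should not change the system's decision. This is only an illustration under stated assumptions, not the project's method: `toy_model` is a hypothetical, deliberately biased stand-in for a real LLM call, and the template and variants are invented for the example.

```python
def metamorphic_violations(model, template, variants):
    """Fill the template with each variant and return the variants whose
    output differs from the output obtained with the first variant."""
    baseline = model(template.format(group=variants[0]))
    return [
        v for v in variants[1:]
        if model(template.format(group=v)) != baseline
    ]


def toy_model(prompt):
    # Hypothetical stand-in for an LLM call; deliberately biased so that
    # the check has something to find.
    return "reject" if "older" in prompt else "approve"


template = (
    "Should the loan application from a {group} applicant "
    "with a stable income be approved?"
)

violations = metamorphic_violations(
    toy_model, template, ["younger", "older", "middle-aged"]
)
print(violations)
```

No reference answer is needed: the test only compares outputs across related inputs, which is what makes the approach attractive for automating checks that are otherwise manual.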
Listen to the podcast presentation:


Spare Cores

This project helps DevOps, DS, ML, AI, ETL, AV, and other engineering teams to find optimal instances for their batch jobs (e.g. "8 CPU cores, 64 GB of RAM, and a TPU needed in an EU datacenter to train ML models for 6 hours") by providing:
📌 Open-source tools, database schemas and documentation to monitor cloud and flexible VPS/dedicated server vendors and their compute resource offerings in an innovative and genuinely comparative way, including vendor details (e.g. location, certificates, green power), compute capabilities (e.g. CPU, memory, GPU/TPU), pricing (especially of spot instances), and performance (by running task-specific benchmarks).
📌  Managed infrastructure, databases, APIs, SDKs, and web applications to make these continuously and transparently tracked data sources publicly available and comparable in a validated, unbiased, structured, and searchable manner.
📌 Helpers to quickly start and manage instances at all supported vendors through a standardised API.
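The example query quoted above (8 CPU cores, 64 GB of RAM, a TPU, an EU datacenter, 6 hours) can be pictured as a filter over a catalogue of instance offerings followed by a price comparison. The sketch below is a minimal illustration and does not use the Spare Cores schema, SDK, or API: the `Instance` fields, the `cheapest` function, the vendor names, and the prices are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    # Hypothetical catalogue record, not the project's actual schema.
    vendor: str
    name: str
    cpus: int
    ram_gb: int
    accelerator: str          # "TPU", "GPU", or "" for none
    region: str               # e.g. "EU" or "US"
    spot_price_per_hour: float


def cheapest(instances, cpus, ram_gb, accelerator, region, hours):
    """Return (instance, total_cost) for the cheapest matching offering,
    or None if nothing satisfies the requirements."""
    matches = [
        i for i in instances
        if i.cpus >= cpus
        and i.ram_gb >= ram_gb
        and i.accelerator == accelerator
        and i.region == region
    ]
    if not matches:
        return None
    best = min(matches, key=lambda i: i.spot_price_per_hour)
    return best, best.spot_price_per_hour * hours


# Made-up offerings for illustration only.
catalogue = [
    Instance("vendor-x", "a", 8, 64, "TPU", "EU", 2.0),
    Instance("vendor-y", "b", 16, 128, "TPU", "EU", 1.5),
    Instance("vendor-z", "c", 8, 64, "TPU", "US", 1.0),
    Instance("vendor-x", "d", 8, 32, "TPU", "EU", 0.5),
]

result = cheapest(catalogue, cpus=8, ram_gb=64, accelerator="TPU",
                  region="EU", hours=6)
print(result)
```

In this toy catalogue, offering `b` wins at a total of 9.0 for six hours: `c` is cheaper but outside the EU, and `d` has too little memory. A real comparison would also weigh benchmark performance, not just the hourly price.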

🔎 Read the interview with Gergely Daróczi (Spare Cores)
Listen to the podcast presentation:


WASPER

With the support of the NGI Search Programme, this project aims to strengthen democracy by creating a tool that helps online publishers and users make informed decisions about the content they consume.
Its primary target is AI-generated text, which online trolls increasingly use to spread misinformation, propaganda and manipulative claims.
The team will focus on two aspects to develop this tool:
📌 Designing a novel, internally developed taxonomy that captures a wider diversity of trolling content types;
📌 Applying state-of-the-art generative architectures with multilingual capabilities in the tool's development.
The team's experienced data scientists will ultimately produce tools that help media (and readers) detect trolling content created by AI, a primary target of challenge 2 of the Programme.
Listen to the podcast presentation:


E.V.A. Gallery

The project's main objective is to develop a new platform for searching, finding, presenting, and supporting European visual art in virtual space for various audiences, from creators to gallerists and consumers.
Their vision proposes long-term sustainability and considerable potential for further developing and utilising cutting-edge technologies such as AI, virtual reality, and blockchain.
💡 Gallerists and artists can easily create their virtual galleries by creating and selling NFTs from their art. An AI assistant, E.V.A. (short for European Visual Arts), will help users identify and find the information and art pieces they seek.
Listen to the podcast presentation:


InDi

InDi aims to create an open-source search engine based on explainability, anonymity, fairness, and inclusiveness. InDi will use AI and NLP to re-rank results and give textual explanations. It will also rely on a community of voluntary human reviewers and validators to manually check doubtful AI re-rankings and accept or reject them, improving the InDi knowledge base for subsequent queries.
A blockchain mechanism will ensure a fair and anonymous review process and encourage reviewers from discriminated minorities to join. To retain the best reviewers, InDi will also award them crypto-assets: utility tokens whose extrinsic value may rise as the platform gains momentum.
The participants, Cagliari University and R2M, have had a close-knit partnership for years and are leaders in their respective academic and business domains.
Listen to the podcast presentation:


Better Food Search

Open datasets about food products and their ingredients (Open Food Facts, USDA FoodData Central, SmartLabel) and detailed information about ingredients themselves (Wikidata, Wikipedia) exist.
Each of them provides ways to search its own data, but none covers the others. The project therefore proposes to combine those datasets to create an LLM-powered search engine for food products. As part of the project, the team will convert and combine datasets about food products and ingredients into semantic datasets.
For that, they will create a general tool that combines multiple datasets and uses an LLM to clean the data (e.g. by performing entity resolution).
Once the unified dataset is ready, they will build an LLM-powered search engine on top of it, supporting recommendations for food products based on factual data.
Listen to the podcast presentation:


EU programme:  HORIZON-CL4-2021-HUMAN-01  

This project has received funding from the European Union's Horizon Europe research and innovation programme under grant agreement 101069364 and is framed under the Next Generation Internet Initiative.
