AI Modules FAQ
1. Data Management and Privacy
Data Sources: What data sources will be used for training and operating the AI summarization module? Are these sources compliant with relevant data protection laws?
We use data from several publicly available tender sources: both the Mercell-generated tender metadata and the documents included in the tender. NOTE: We use foundational models (LLMs such as OpenAI's GPT-4o) under terms that explicitly opt our data out of model training. Hence, we are not training any models ourselves.
Data Storage: Where will the data be stored, and what measures are in place to protect this data from unauthorized access?
We are storing the data in a dedicated AWS account which follows the same general policies as other Mercell AWS accounts. Access is possible in two ways:
Through AWS IAM users/roles with the right permission set. At the time of writing, this is only available to developers working on the system.
Through our REST API with the right user credentials. This is secured exclusively through the OAuth2-compliant client credentials flow. Credentials must be created by the developer team.
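As an illustration, the client credentials flow described above exchanges a client ID and secret for an access token. This is a hypothetical sketch: the token URL and credentials below are placeholders, not real Mercell endpoints.

```python
# Hypothetical sketch of the OAuth2 client credentials grant used to
# obtain an access token for the REST API. The token URL, client ID and
# secret are placeholders, not real Mercell values.
import urllib.parse

def build_token_request(token_url, client_id, client_secret):
    """Build the form-encoded body for an OAuth2 client credentials grant."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    })
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return token_url, headers, body

url, headers, body = build_token_request(
    "https://auth.example.com/oauth2/token", "my-client-id", "my-secret")
# The access token returned by the token endpoint would then be sent as
# "Authorization: Bearer <token>" on each REST API call.
```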
2. Transparency and Explainability
Model Explainability: How will the AI model's decision-making process be documented and made understandable to users?
The internal decision process of the AI model is not transparent, and full transparency is not achievable: we use proprietary foundational models whose vendors do not document how they were trained or how they generate results. We can, however, provide transparency about the surrounding process and about how the model is used. The model itself is a black box.
Auditability: What mechanisms are in place for auditing the AI module’s outputs and ensuring they align with expected standards?
We have several quality control mechanisms:
1. Human evaluation of samples. We check our summaries against expert knowledge from within Mercell for accuracy and completeness (precision and recall).
2. Machine evaluation of samples. We use a set of benchmark scores to evaluate the summary output against a set of reference summaries. This makes it possible to quantify summary accuracy and completeness.
3. Machine evaluation at scale. We apply the same benchmark scores and acceptability criteria to evaluate each summary before it is presented to the customer/user.
4. User evaluation. End users can provide feedback on the summaries, both binary (is it a good or bad summary?) and qualitative (what is missing, what can be improved?). This feedback is incorporated into both our human and our machine evaluation of results.
Out of these evaluations, 1 and 2 are currently in place. 3 and 4 are being planned.
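The exact benchmark scores used for machine evaluation are not specified above; as a minimal illustration (not the production metric), a generated summary can be compared to a reference summary by unigram overlap, in the spirit of ROUGE-1 precision/recall/F1:

```python
# Illustrative-only unigram-overlap score between a candidate summary
# and a reference summary; the production benchmark scores may differ.
from collections import Counter

def unigram_f1(candidate: str, reference: str) -> dict:
    """Compute word-overlap precision, recall and F1 between two texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # shared word occurrences
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-9)
    return {"precision": precision, "recall": recall, "f1": f1}

scores = unigram_f1(
    "the tender covers road maintenance in the northern region",
    "the tender concerns road maintenance in the northern region",
)
# A threshold on such scores can serve as an acceptability criterion
# before a summary is shown to the user.
```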
3. Bias and Fairness
Bias Mitigation: What steps are being taken to identify and mitigate any biases in the AI model’s training data and outputs?
We do not use the foundational models' internal knowledge; instead, we use them as a way to process the documents. We instruct the models to use only the information available in the documents. This leaves minimal room for bias beyond any bias captured in the documents themselves.
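The constraint described above is applied through prompting. The following is a hypothetical sketch of what such an instruction might look like; the exact production prompt is not shown here, and the message structure follows the common chat-completion convention of system and user roles.

```python
# Hypothetical prompt restricting the model to the supplied document
# text only; wording is illustrative, not the production prompt.
SYSTEM_PROMPT = (
    "You are a summarization assistant. Summarize ONLY the tender "
    "document provided below. Do not use outside knowledge, and if a "
    "detail is not present in the document, state that it is missing."
)

def build_messages(document_text: str) -> list:
    """Assemble a chat-style request with the constraining instruction."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": document_text},
    ]

messages = build_messages("Tender 2024/17: road maintenance, deadline 1 June.")
```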
Fairness Checks: How will we ensure that the summarization is fair and non-discriminatory across different types of tender documents?
The model evaluates all the documents and extracts the most accurate information. Since the model's output is not fully deterministic, identical results cannot be guaranteed on every run, but in principle we process factual documents and retrieve factual information from all of them. No mechanism is in place to give special treatment to certain types of documents.
4. Performance and Accuracy
Evaluation Metrics: What metrics will be used to evaluate the performance and accuracy of the AI summaries?
See the answer to "Auditability" under section 2.
Continuous Monitoring: How will we monitor the AI system's performance over time to ensure consistent quality and accuracy?
See the answer to "Auditability" under section 2, points 3 and 4.
5. User Interaction and Feedback
User Feedback Mechanism: How will we collect and incorporate user feedback to improve the AI summaries?
There will be two ways for users to provide feedback: a thumbs up/down button (or equivalent) and an input field for qualitative comments on summary quality. This feedback will be evaluated by the AI team and used to improve the product.
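A feedback record combining the two channels above could take roughly the following shape; the field names and the endpoint it would be posted to are illustrative assumptions, not the actual Mercell schema.

```python
# Hypothetical shape of a user feedback record combining the binary
# rating and the free-text comment; field names are illustrative only.
from dataclasses import dataclass, asdict

@dataclass
class SummaryFeedback:
    summary_id: str
    thumbs_up: bool      # binary rating from the thumbs up/down control
    comment: str = ""    # optional qualitative input on summary quality

fb = SummaryFeedback("tender-123", thumbs_up=False,
                     comment="Deadline for questions is missing.")
payload = asdict(fb)  # e.g. serialized for a feedback endpoint
```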
Error Reporting: What processes will be in place for users to report errors or inaccuracies in the summaries?
See the previous answer. In addition, users can use the same feedback/support system Mercell currently has in place.