The ethical AI playbook 2.0

Part 1: Agreement on data usage

An ethical playbook for artificial intelligence for the real estate and construction sector compiled by the Building Information Foundation RTS and A-INS Group. The purpose of the updated playbook is to support actors in the built environment in utilizing artificial intelligence, and to promote sustainable new technologies in the sector.

“What needs to be agreed on, and how will the data be used?”

Data agreements, confidentiality, and intellectual property rights

The EU High-Level Expert Group published its Ethics Guidelines for Trustworthy AI in 2019. The guidelines emphasize that the quality of the datasets used is critically important for the ethical functioning of AI systems, as biased training data can reflect and reinforce distortions in society. AI is often used to improve efficiency and performance, yet at the same time organizations want to protect their own data and avoid violating others’ rights. For these reasons, using and training AI always creates a need to agree on data, whether a company in the AECO sector wants to use AI in its business, an individual uses an AI service, or a company acquires AI solutions for its own use.

In this section, we describe the key issues that must be agreed upon when working with AI in the AECO sector. We provide a brief overview of intellectual property rights and focus on data use, agreements, and matters that organizations in the AECO sector must consider when developing and using AI. An organization that uses AI ethically respects intellectual property rights, handles data agreements appropriately, and ensures confidentiality.

According to the European Commission’s Ethics Guidelines for Trustworthy AI (2019), trustworthy AI has three requirements:

  1. It must be lawful.
  2. It must be ethical.
  3. It must be both technically and socially robust.

Data plays a central role in AI development. Large amounts of data are needed to train AI algorithms, users provide data to AI during use, and algorithms in turn generate multiple new forms of data from it.

The current state of data agreements in the AECO sector

Recent studies, for example Regona et al. (2022), indicate that the construction industry has created standards that make data sharing between companies difficult due to questions related to intellectual property rights. Companies often lack a clear framework as well as guidance on how to implement these technologies on construction sites. In addition, concerns arise regarding the security, reliable storage, efficiency, and interpretation of large volumes of data used on construction sites.

Currently, the general understanding is that standard contracts in the AECO sector do not comprehensively address the use of AI-related data, creating a clear need for such provisions in the future. Stakeholders in the AECO sector also express a desire to negotiate data use more transparently and to have greater influence over how their data is used.

Going forward, the AECO sector’s contracting culture must consider data agreements related to AI, ensuring ethical data use, openness, and transparency. For example, when project-specific data is used in AI systems, contracts must define:

  • how the data may be used in AI services
  • how long the data may be used
  • for what purposes the data may be used
  • who is responsible for the use of the data

Closely related to these requirements, organizations should also agree on compensation for data use, data-protection measures, and the allocation of responsibilities. Applicable terms must be formulated on a case-by-case basis and incorporated into standard contracts, since data usage needs can vary between different applications.

Intellectual property rights and different types of data

Intellectual property rights (IPR) protect intangible assets such as creative works, inventions, brands, and valuable information.

According to the Finnish Tietopolitiikan käsikirja (Handbook of Information Policy, 2024), copyright protects creative work produced by humans and expressed in the form of a literary or artistic work. To qualify for copyright protection, a work must be the author’s own original creation. Whether these conditions are met is assessed on a case-by-case basis using the concept of the threshold of originality, which has developed differently for different types of works. In Finland, only a natural person can hold copyright to a work. A work created purely by AI, such as an image, music, or video, is not protected by copyright.

According to the Finnish Patent and Registration Office, industrial property rights are used to protect, manage, and commercialize the results of development work. The holder has the right to prohibit use and to grant licenses that allow others to commercially exploit the protected subject matter, which may include, for example, a technology, a trademark, or a design.

Industrial property rights are important tools particularly in product development and marketing.

Confidentiality refers to protecting AI applications and the data they use so that only authorized individuals or parties have access to it. This applies especially to AI input data and the data processed within the system, both of which require strict information security practices and technologies. A breach of confidentiality can result in data leaks that jeopardize privacy and business operations.

In addition to confidentiality and copyright, AI development involves broader legal considerations, such as the application of patents to AI technologies, liability for incorrect decisions or damages, and compliance with legislation.

Neighboring rights to copyright cover, for example, database rights. The database right, also known as the sui generis right, protects the investment required to create a database.

Patents protect technological innovations and inventions.

Utility model rights provide a simpler form of protection for practical, application-oriented inventions.

Design rights protect the appearance and design features of products, which is important, for example, in industrial design.

Trademark rights protect a company’s brand, such as logos and symbols, helping it distinguish itself from competitors in the marketplace.

Trade secrets protect a company’s valuable and confidential information, such as customer databases, strategies, and technical processes.

The types of data involved in AI use can be divided into three main categories: training data, input data, and output data. Managing and understanding output data carries important ethical and legal implications, particularly concerning intellectual property rights and the responsible use of information. Output data reflects the final result produced by the AI and provides a reference point for assessing, for example, whether copyright criteria are met and whether the content involves protected intellectual property.

Training data refers to the information used to develop an AI model. This data may be copied and transformed into various formats to meet training requirements.

Input data refers to the information provided to the AI during its use. This data may also include material copied from other sources, and it is not always clear how or where it will be used.

Output data refers to the information produced by the AI. However, the origin or reliability of this data is not always fully known, which may raise questions about its suitability and appropriateness for use.

Confidentiality and copyrights

Copyrights in AI development in the AECO sector

AI systems in the AECO sector often make use of large datasets that may include design documents, construction methods, product information, building usage data, or companies’ internal processes. It is therefore essential to have clear guidance on data ownership and intellectual property rights. Companies must ensure that AI algorithms do not violate copyrights or use protected content without permission. This requires obtaining the necessary rights to the data used for AI training and respecting the intellectual property of documents and data related to design, construction, and building management.

During AI development and deployment, the key copyright questions are:

  • defining ownership (who holds the copyrights)
  • usage rights (who may use the data)
  • how the rights of the original works must be respected

For example, during the design phase of a building, AI tools may be used to generate design alternatives, optimize plans, or speed up material selection.

In design activities within the AECO sector, an important point is that AI-generated content is not necessarily protected by copyright unless there has been significant human involvement in its creation. Liang et al. (2024) emphasize that designers must ensure they have the rights to use AI-generated material. When AI tools make use of existing designs or design data, appropriate licensing agreements must be in place. From a copyright perspective, it is necessary to develop ways to identify the human role in content creation. If a designer uses AI as a tool, they may theoretically obtain copyright if their own contribution is sufficiently independent and original. For this reason, designers should keep records of the information they input into the AI (such as prompts) and document the outputs to demonstrate proper conduct.
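The record-keeping practice described above can be sketched in code. A minimal, illustrative example that appends each prompt/output pair to a JSON Lines log; the file name and field names are our own assumptions, not part of any standard or tool:

```python
import hashlib
import json
from datetime import datetime, timezone

LOG_PATH = "ai_usage_log.jsonl"  # illustrative file name, not a standard


def log_ai_interaction(tool: str, prompt: str, output: str, author: str) -> dict:
    """Append one prompt/output pair to a JSON Lines audit log.

    Hashing the output lets the log reference large artifacts
    (images, models) without storing them inline.
    """
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,        # which AI service was used
        "author": author,    # who made the human contribution
        "prompt": prompt,    # the designer's own input
        "output_sha256": hashlib.sha256(output.encode()).hexdigest(),
    }
    with open(LOG_PATH, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


entry = log_ai_interaction(
    tool="example-image-generator",
    prompt="Facade study, three-storey timber office, gabled roof",
    output="<generated image bytes as placeholder text>",
    author="designer@example.com",
)
```

A log like this supports later claims about the designer’s own contribution, but it is documentation only; it does not by itself establish copyright.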

In the AECO sector, managing copyrights also poses challenges for data use, specifically whether actors have sufficient rights to copy, modify, and share the data, and who ultimately owns the rights to AI outputs. Copyright exceptions that might help address some of these concerns remain largely untested, and the most significant one is not easily applicable to commercial activity. According to guidelines by Aalto University (accessed 10 December 2024), the text and data mining exception in the EU DSM Directive (2019/790) and Finnish copyright law allows the use of works as training, validation, and test data for AI models, as well as their use as input for AI systems.

AI development has progressed rapidly, and at the time of writing this guide there is still little case law available, especially in relation to the AECO sector. Copyright issues in AI remain uncertain, the application of exceptions is unclear, and the only way to increase legal certainty is to agree comprehensively on data use. Major industry players have recognized this as well: OpenAI, for example, has signed a licensing agreement with Reddit to use its content for AI development.

Information security in AI development in the AECO sector

Emaminejad et al. (2022) emphasize that privacy and information security are essential in AI adoption, particularly in the AECO sector, where sensitive project data is handled. Privacy issues arise when AI systems process personal or sensitive information. Protecting AI systems from unauthorized access is critical for building trust. The Deloitte AI Institute (2023) notes that AI use requires careful balancing between privacy and ethical considerations to ensure sensitive data is not compromised. This also applies to construction projects that use AI.

As in other industries, privacy and information security must be prioritized in construction. AI systems may process sensitive data about companies and individuals involved in a project, including client information, architectural designs, financial data, project schedules, budgets, and contractor information. To protect such data, AI systems must be designed with strong encryption, access control, and confidentiality agreements. Strict information-governance policies are necessary to ensure that only authorized personnel can access confidential data handled by the AI. Secure storage and access management are needed to prevent leaks of sensitive information. Ensuring AI-related data protection is a continuous and evolving task for organizations in the AECO sector, safeguarding data from unauthorized access and cyber threats. Continuous oversight of data disclosure is equally important to prevent leakage of internally generated operational data.
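One of the controls mentioned above, access control, can be illustrated with a minimal role-based sketch; the roles, classification levels, and policy table are illustrative assumptions, not an established scheme:

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    classification: str  # e.g. "public", "internal", "confidential"


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)


# Which roles may read which classification level (assumed policy).
ACCESS_POLICY = {
    "public": {"viewer", "engineer", "data_steward"},
    "internal": {"engineer", "data_steward"},
    "confidential": {"data_steward"},
}


def can_access(user: User, dataset: Dataset) -> bool:
    """Allow access only if one of the user's roles is authorized
    for the dataset's classification level."""
    allowed = ACCESS_POLICY.get(dataset.classification, set())
    return bool(user.roles & allowed)


steward = User("alice", {"data_steward"})
engineer = User("bob", {"engineer"})
designs = Dataset("client_floor_plans", "confidential")
```

In practice such checks sit inside the data platform rather than application code, but the principle is the same: every dataset carries a classification, and access is denied by default.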

You can learn more about AI risk management in Part 2: Risk Management and Information Security.

Confidentiality in AI development in the AECO sector

The Deloitte AI Institute (2023) recommends that AI systems be designed to protect data using strong encryption, access control, and confidentiality agreements. Organizations developing AI must adopt strict AI information-governance policies to ensure that only authorized individuals have access to confidential data and training datasets processed by AI systems.

AI data-governance policies must also comply with data protection regulations (GDPR) to ensure that project data remains secure and personal data is processed appropriately. Transparency and clear consent mechanisms are important so that all stakeholders understand how their data is being used within AI systems.

AI systems must be used ethically, and data should be collected only to the extent necessary for specific purposes. A particular concern is that users often lack sufficient understanding of data collection and use, especially given the common requirement to accept service terms before use. Research indicates that privacy notices and AI service terms should, by default, be human-friendly and easy to understand. Users should be informed about what data is collected, how it is used, and that their data may be used for the service provider’s own analysis and commercial purposes. Auditing AI data-governance practices is an effective way to ensure that confidentiality is maintained as intended.


An example of agreement items aimed at enabling data utilization in the AECO sector

Mapping the data used in AI development between different parties is the first step in identifying and mitigating the potential risks associated with AI use. When agreeing on data processing in relation to AI, it is important first to outline the foundational principles and develop an understanding of what kinds of problems may arise in data processing, and at what stage these issues may materialize in the AI-related data lifecycle.

Where to begin?

When agreeing on the use of data for AI in construction processes, a three-step approach can be applied:

  1. Begin by defining who owns the rights to the data used with AI. This includes training data, input data, and output data, as well as who is responsible for ensuring that these do not infringe the rights of third parties.
  2. Next, agree on how this data may be used and ensure that sufficient rights are granted for the intended purposes. This often includes rights to use, copy, modify, and share the data, and may also include agreeing on sublicensing rights.
  3. Finally, agree on data storage, retention periods, management, and confidentiality. The contract should specify what types of AI systems will be used, clarify how they process the data, and whether they may disclose the data further. If the data contains personal information, all such processing must comply with data protection legislation.

All the above points should be formulated in a way that fits either a specific use case or can be included in standard contract terms if shared consensus is reached.

The list above is an example and not yet exhaustive. It does not cover all legal aspects but provides a starting framework for stakeholders in the AECO sector when drafting agreements.
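The three steps above could also be captured as a structured checklist so that open negotiation items are visible at a glance. A minimal sketch; every field name here is an illustrative assumption, not a standard contract term:

```python
from dataclasses import dataclass, asdict
from typing import Optional, Tuple


@dataclass
class DataAgreement:
    # Step 1: ownership and responsibility
    training_data_owner: Optional[str] = None
    input_data_owner: Optional[str] = None
    output_data_owner: Optional[str] = None
    third_party_rights_responsible: Optional[str] = None
    # Step 2: granted usage rights
    rights_granted: Optional[Tuple[str, ...]] = None  # e.g. ("use", "copy")
    sublicensing_allowed: Optional[bool] = None
    # Step 3: storage, retention, confidentiality
    retention_period_months: Optional[int] = None
    ai_systems_in_scope: Optional[Tuple[str, ...]] = None
    may_disclose_further: Optional[bool] = None
    contains_personal_data: Optional[bool] = None  # if True, GDPR applies

    def open_items(self) -> list:
        """Fields still set to None, i.e. not yet negotiated."""
        return [key for key, value in asdict(self).items() if value is None]


# A partially negotiated agreement: three items decided, seven still open.
agreement = DataAgreement(
    training_data_owner="Contractor Oy",          # hypothetical party
    rights_granted=("use", "copy", "modify"),
    contains_personal_data=False,
)
```

A record like this does not replace the contract itself, but it makes gaps in the negotiation explicit before the AI system goes into use.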

Future Outlook

Data needed for the use and development of AI requires a new culture of agreements and trust

At the time of writing this playbook, there are still no clear, widely applicable rules or legal precedents concerning copyright in relation to AI. As a starting point, AI systems must be used ethically, and organizations should operate them transparently. A key requirement is to minimize the use of personal data in AI training and to ensure that clear usage agreements and terms are in place.

Organizations in the AECO sector will need to implement continuous monitoring of data protection compliance, both as developers and users of AI systems. Robust data-governance policies are required to ensure both technically and operationally ethical behaviour in AI systems.

In the future, the only way to ensure that copyrights are respected is to agree openly on the use of data. This begins with identifying the challenges or obstacles that may arise when providing data for use in an AI system, or when developing an AI system using external data. Standard contract terms must clearly describe the flow of data and comprehensively set out agreements for at least the following aspects: how the data will be used, who may use it, compensation, usage rights, data management, responsibility in cases of misuse, to whom data may be disclosed, and for what purposes it may be used.

Consider at least these!

AI systems in the AECO sector require a high level of confidentiality, which can be developed and maintained through at least the following principles (adapted from the Deloitte AI Institute (2023) and Philipsson (2024)):

  • Agree comprehensively on data use: including compensation, usage rights, data management, and responsibility in cases of misuse.
  • Minimize the use of personal data in training datasets and ensure the protection of individual rights as well as GDPR compliance.
  • Ensure transparency of the AI system.
  • Use clear, human-friendly agreements that provide transparent and understandable usage-right terms.
  • Ensure the legal basis for both the AI system and the data it processes.
  • Maintain confidential data handling at all stages.
  • Keep records of the data you provide to the AI system and the data you are permitted to use.
  • Ensure that the AI system’s behaviour is explainable (explainability).
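The data-minimization principle in the checklist above can be illustrated with a small sketch that keeps only the fields needed for a stated purpose and masks email addresses found in free text; the field names and the single pattern are illustrative assumptions, not a compliance tool:

```python
import re

# Matches common email addresses appearing in free-text fields.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def minimize(record: dict, needed_fields: set) -> dict:
    """Keep only the fields required for the stated purpose and
    mask email addresses embedded in string values."""
    out = {}
    for key in needed_fields & record.keys():
        value = record[key]
        if isinstance(value, str):
            value = EMAIL.sub("[redacted]", value)
        out[key] = value
    return out


raw = {
    "project_id": "P-104",
    "schedule_notes": "Contact site manager at manager@example.com",
    "client_name": "Private person",  # not needed for training, dropped
}
clean = minimize(raw, needed_fields={"project_id", "schedule_notes"})
```

Real pseudonymization requires far more than one regular expression, but the design choice stands: decide which fields a purpose genuinely needs, and strip everything else before data reaches the AI system.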

Read more

High-Level Expert Group on Artificial Intelligence, Ethics Guidelines for Trustworthy AI, 2019, link: https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai , accessed 20.9.2024

In Finnish: Poikola, Antti; Markkanen, Jouni; Parkkila, Janne, Tietopolitiikan käsikirja (Handbook of Information Policy, 2024), link: https://tietopolitiikka.fi/2024/11/14/kasikirja-tekoalysta-paatoksentekijoille/ , accessed 10.12.2024

Regona, Massimo; Yigitcanlar, Tan; Xia, Bo; Li, R.Y.M. (2022). Opportunities and Adoption Challenges of AI in the Construction Industry: A PRISMA Review. Journal of Open Innovation: Technology, Market, and Complexity, link: https://www.sciencedirect.com/science/article/pii/S219985312201054X

Liang, Ci-Jyun; Le, Thai-Hoa; Ham, Youngjib; Mantha, Bharadwaj R.K.; Cheng, Marvin H.; Lin, Jacob J. (2024). Ethics of artificial intelligence and robotics in the architecture, engineering, and construction industry. Automation in Construction, Volume 162, 105369, ISSN 0926-5805

In Finnish: Aalto University, guidance on AI systems and copyright (tekoälyjärjestelmät ja tekijänoikeus), link: https://www.aalto.fi/fi/palvelut/tekoalyjarjestelmat-ja-tekijanoikeus , accessed 10.12.2024

Deloitte AI Institute, The Legal Implications of Generative AI, 2023, link: https://www.deloitte.com/content/dam/assets-shared/docs/services/legal/2023/dttl-legal-perspective-the-legal-implications-of-generative-ai.pdf , accessed 10.12.2024

Emaminejad, Newsha; North, Alexa; Akhavian, Reza (2022). Trust in AI and Implications for AEC Research: A Literature Analysis. 295-303. 10.1061/9780784483893.037. Link: https://www.researchgate.net/publication/360826085_Trust_in_AI_and_Implications_for_AEC_Research_A_Literature_Analysis

Philipsson, Fredrik (2024). Building Trust in AI Ethics, link: https://redresscompliance.com/building-trust-in-ai-a-guide-to-ethical-considerations/ , accessed 10.12.2024