The 'mbd AI Pipeline

MBD Data Schema

What are Interactions?

Interactions in the MBD system refer to various ways users engage with each other or with content (items) on a social media platform. These interactions can be broadly classified into two categories:

1. User-to-User Interaction Examples:

Below are examples of user-to-user interactions. These interactions can be expanded based on the social media platform.

Message: This involves sending direct messages between users, facilitating private conversations.
Follow: This action allows a user to follow another user's profile to see their updates and posts.
Bookmark (Favourite): Users can save other users' profiles for quick access later.
Block: This action prevents one user from interacting with or viewing the content of another user.

2. User-to-Item Interaction Examples:

Below are examples of user-to-item interactions. These interactions can be expanded based on the social media platform.

Like: Users can express their appreciation or approval of an item by liking it.
Bookmark (Favourite): Users can save items (posts) to view them later.
Upvote: Users can vote an item up, indicating its value or quality.
Downvote: Conversely, users can vote an item down if they find it unhelpful or inappropriate.

What is an Item?

An item refers to user-generated content on a social media platform. Items can be in the form of text, images, audio, or video posts.

What is a User?

A user is a profile on a social media platform. This profile can either be a standard social media profile or a profile associated with a wallet address.
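The definitions above can be sketched as a small set of types. This is only an illustration of how users, items, and the two interaction categories relate to each other. The class and field names (`User`, `Item`, `Interaction`, `actor_id`, `target_id`, `timestamp`, and so on) are assumptions made for this example and are not the actual MBD schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Union


class UserToUserAction(Enum):
    MESSAGE = "message"
    FOLLOW = "follow"
    BOOKMARK = "bookmark"
    BLOCK = "block"


class UserToItemAction(Enum):
    LIKE = "like"
    BOOKMARK = "bookmark"
    UPVOTE = "upvote"
    DOWNVOTE = "downvote"


@dataclass(frozen=True)
class User:
    user_id: str                          # hypothetical identifier field
    wallet_address: Optional[str] = None  # set only for wallet-linked profiles


@dataclass(frozen=True)
class Item:
    item_id: str     # hypothetical identifier field
    media_type: str  # "text", "image", "audio", or "video"
    text: str = ""


@dataclass(frozen=True)
class Interaction:
    actor_id: str   # the user performing the action
    target_id: str  # a user ID or an item ID, depending on the action
    action: Union[UserToUserAction, UserToItemAction]
    timestamp: int  # hypothetical: seconds since the epoch

    @property
    def is_user_to_user(self) -> bool:
        return isinstance(self.action, UserToUserAction)


follow = Interaction("u1", "u2", UserToUserAction.FOLLOW, 1)
like = Interaction("u1", "i1", UserToItemAction.LIKE, 2)
```

Keeping the two action sets as separate enumerations means the category of an interaction follows from its action, so a bookmark on a profile and a bookmark on a post stay distinguishable.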

AI Pipeline for MBD AI Models

Standardize

In the standardization step, we clean and prepare the dataset for further analysis. First, we extract contextual actions: we take platform-specific information and map it to generic terms for the relevant data points, so that user behaviour can be described consistently. We then apply rule-based filtering to ensure data quality. For instance, items must contain more than 20 characters and appear on an item whitelist, and duplicate items are removed. Similarly, users must have more than 10 interactions and appear on a user whitelist, and duplicate users are removed. Additionally, we perform wallet matching to link profiles with their respective wallet addresses, and we apply cross-protocol standardization to unify user data from different social media platforms into a single, coherent schema.

📘 The rule-based filters we apply can vary depending on the developer's or platform's requirements.
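The example filters from the standardization step can be sketched as follows. This is a minimal illustration, not the pipeline's implementation: the dictionary keys (`item_id`, `text`, `actor_id`) and the function names are assumptions, and the thresholds are the example values quoted above, which may differ per platform.

```python
from collections import Counter

MIN_ITEM_CHARS = 20         # items must contain more than 20 characters
MIN_USER_INTERACTIONS = 10  # users must have more than 10 interactions


def filter_items(items, item_whitelist):
    """Keep whitelisted items longer than MIN_ITEM_CHARS, dropping duplicate IDs."""
    seen = set()
    kept = []
    for item in items:
        item_id = item["item_id"]
        if item_id in seen:
            continue  # duplicate entry
        seen.add(item_id)
        if item_id in item_whitelist and len(item["text"]) > MIN_ITEM_CHARS:
            kept.append(item)
    return kept


def filter_users(user_ids, interactions, user_whitelist):
    """Keep whitelisted users with more than MIN_USER_INTERACTIONS, dropping duplicates."""
    counts = Counter(i["actor_id"] for i in interactions)
    seen = set()
    kept = []
    for user_id in user_ids:
        if user_id in seen:
            continue  # duplicate entry
        seen.add(user_id)
        if user_id in user_whitelist and counts[user_id] > MIN_USER_INTERACTIONS:
            kept.append(user_id)
    return kept


items = [
    {"item_id": "i1", "text": "A post that is comfortably longer than twenty characters."},
    {"item_id": "i1", "text": "A post that is comfortably longer than twenty characters."},
    {"item_id": "i2", "text": "too short"},
    {"item_id": "i3", "text": "Long enough, but not on the whitelist at all."},
]
print([item["item_id"] for item in filter_items(items, {"i1", "i2"})])  # ['i1']
```

Keeping the thresholds as named constants makes it straightforward to swap in different rules for a given developer or platform.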

Enrich Media

The media enrichment step involves analyzing and enhancing media content to extract valuable insights. This process starts with classifying content across various media types, including images, videos, audio, and text. We generate media embeddings, which are vector representations of media content. These embeddings are used to create media features that provide a detailed understanding of the content, such as identifying topics, sentiment, emotions, and moderation needs. By enriching the media data, we lay the groundwork for deeper analysis and more accurate predictions.

Understand Graph

In the graph understanding step, we extract and analyze relational information from the enriched media content. This involves classifying content again to capture relationships between users and items. We create features that represent these relationships, enabling us to detect patterns such as bot networks and trending topics. By understanding the graph structure of interactions, we can gain insights into user behavior and network dynamics, which are crucial for making informed predictions.
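As a rough illustration of relationship features, the sketch below derives simple per-user statistics from follow edges. The feature set (following count, follower count, reciprocity) and the function name `degree_features` are assumptions chosen for this example. The text does not say which graph features MBD uses, and turning such features into bot or trend detection would require a model that is not shown here.

```python
from collections import defaultdict


def degree_features(edges):
    """Compute per-user follow statistics from (actor_id, target_id) pairs."""
    following = defaultdict(set)
    followers = defaultdict(set)
    for actor, target in edges:
        following[actor].add(target)
        followers[target].add(actor)

    features = {}
    for user in set(following) | set(followers):
        out_degree = len(following[user])
        in_degree = len(followers[user])
        mutual = len(following[user] & followers[user])
        features[user] = {
            "following": out_degree,
            "followers": in_degree,
            # share of followed accounts that follow back
            "reciprocity": mutual / out_degree if out_degree else 0.0,
        }
    return features


edges = [("a", "b"), ("b", "a"), ("a", "c")]
features = degree_features(edges)
```

An account that follows many users but is rarely followed back would show a high following count with low reciprocity, which is the kind of pattern a downstream model could take into account.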

Predict

The prediction step involves training and testing machine learning models to forecast user-to-user and user-to-item interactions. We train the models on a subset of the standardized and enriched data so that they can capture patterns and trends. The models are then evaluated against future test data, that is, interactions that occur after the training period, to benchmark their performance. By continuously training, testing, and refining the models, we aim to improve their predictive accuracy and make more reliable forecasts about user behaviour and interactions on social media platforms.
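Evaluating against future test data can be read as a time-based split: the model is fitted on earlier interactions and scored on later ones. The sketch below shows that idea under stated assumptions. The `timestamp`, `actor_id`, and `target_id` keys, the cutoff value, and the simple hit-rate metric are all illustrative and are not taken from the MBD pipeline.

```python
def temporal_split(interactions, cutoff):
    """Split interactions into (train, test) around a cutoff timestamp."""
    train = [i for i in interactions if i["timestamp"] < cutoff]
    test = [i for i in interactions if i["timestamp"] >= cutoff]
    return train, test


def hit_rate(predicted_pairs, test):
    """Fraction of predicted (actor_id, target_id) pairs that occur in the test set."""
    if not predicted_pairs:
        return 0.0
    actual = {(i["actor_id"], i["target_id"]) for i in test}
    hits = sum(1 for pair in predicted_pairs if pair in actual)
    return hits / len(predicted_pairs)


interactions = [
    {"actor_id": "u1", "target_id": "i1", "timestamp": 1},
    {"actor_id": "u2", "target_id": "i1", "timestamp": 2},
    {"actor_id": "u1", "target_id": "i3", "timestamp": 3},
    {"actor_id": "u2", "target_id": "i4", "timestamp": 4},
]
train, test = temporal_split(interactions, cutoff=3)
score = hit_rate([("u1", "i3"), ("u1", "i9")], test)
```

Splitting by time rather than at random keeps information from the test period out of training, which is closer to how the predictions would be used in practice.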

AI Pipeline Results for MBD AI Models

1. User & Item Resolution

User Profiles: Social media user profiles are gathered, and each profile is assigned an MBD user ID. A detailed profile is created for each user, which includes their interactions and behaviors on the platform. This is the user resolution use case.
Item Profiles: Social media posts, referred to as items, are collected. Each item is given an MBD item ID, and a detailed profile is created for it. This is the item resolution use case.

2. User & Item Moderation

📘 User moderation is currently based on the media analysis step. Behaviour moderation (graph-based moderation) is not available yet, but it is coming soon.

User Moderation: Behaviour moderation covers bot detection and impersonation detection. AI models analyse user behaviour patterns and assign labels to profiles, indicating whether a user is likely to be a spammer or a bot.
Item Moderation: Media moderation is applied, including topic detection, emotion detection, sentiment detection, and spam/misinformation detection. AI labels are assigned to items based on these analyses, helping to maintain content quality on the platform.

3. User & Item Discovery

User Discovery: User media embeddings are created from the user's profile information and their last 100 item interactions. By analysing these embeddings, the system identifies data trends and generates statistics about user behaviour. Similar users are identified based on shared interactions, such as messaging, following, bookmarking, blocking, liking, upvoting, and downvoting similar items or users.
Item Discovery: Item profiles are used to create item media embeddings independent of user profiles. The system generates statistics based on how users interact with items, such as liking, bookmarking, upvoting, or downvoting them. These insights help in creating item segments and discovering similar items based on specific user and item interactions.
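One simple way to picture embedding-based discovery is to pool the vectors of a user's recent items and compare the results with cosine similarity. The sketch below does exactly that. Mean pooling, the function names, and the toy two-dimensional vectors are assumptions for illustration; the text does not specify how MBD combines profile information and item interactions into an embedding.

```python
import math

MAX_HISTORY = 100  # only the last 100 item interactions are used


def user_embedding(item_vectors):
    """Average the vectors of a user's most recent items (ordered oldest to newest)."""
    recent = item_vectors[-MAX_HISTORY:]
    if not recent:
        raise ValueError("at least one item vector is required")
    dim = len(recent[0])
    return [sum(vec[d] for vec in recent) / len(recent) for d in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


alice = user_embedding([[1.0, 0.0], [0.0, 1.0]])
bob = user_embedding([[2.0, 2.0]])
similarity = cosine_similarity(alice, bob)
```

The same similarity function applies to item embeddings, so finding similar items and finding similar users can share one comparison step.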

4. User & Item Personalisation

The final step is about personalising the user experience based on the insights gained from the previous steps:

User Personalisation: The system uses discovered similar users to make action predictions. It recommends users to other users, predicting actions such as messaging, following, bookmarking, or blocking based on MBD predictions. This personalised recommendation helps enhance user engagement on the platform.
Item Personalisation: After creating item segments and identifying similar items, the system recommends items to users. These recommendations are based on predicted actions such as liking, bookmarking, upvoting, or downvoting items. This personalised content suggestion helps users discover items they are likely to engage with, improving their overall experience on the platform.
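To make the link between discovery and personalisation concrete, the sketch below ranks items for a user by summing the similarity scores of similar users who liked them. This is a plain neighbour-based heuristic used here for illustration, not the MBD prediction model. The function name, the input shapes, and the `top_k` parameter are assumptions.

```python
from collections import defaultdict


def recommend_items(target_user, liked_by_user, similar_users, top_k=3):
    """Rank unseen items by the summed similarity of the neighbours who liked them.

    liked_by_user: dict mapping user ID to a set of liked item IDs.
    similar_users: list of (user ID, similarity score) pairs for target_user.
    """
    seen = liked_by_user.get(target_user, set())
    scores = defaultdict(float)
    for neighbour, similarity in similar_users:
        for item in liked_by_user.get(neighbour, set()):
            if item not in seen:
                scores[item] += similarity
    # highest score first; ties are broken by item ID for a stable order
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_k]]


liked = {
    "u1": {"i1"},
    "u2": {"i1", "i2", "i3"},
    "u3": {"i3", "i4"},
}
recommendations = recommend_items("u1", liked, [("u2", 0.9), ("u3", 0.5)])
```

In this toy data, `i3` ranks first because both similar users liked it, and `i1` is excluded because the target user has already interacted with it.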