
Large Generative Graph Models (LGGMs): A New Class of Graph Generative Model Trained on a Large Corpus of Graphs

https://arxiv.org/abs/2406.05109

Large Generative Models (LGMs) such as GPT, Stable Diffusion, Sora, and Suno have recently made remarkable strides in generating creative and meaningful content, greatly boosting the efficiency of real-world applications. Unlike earlier models such as BERT/BART in Natural Language Processing (NLP) and U-Net in image segmentation, which were trained on small datasets from specific areas for narrow tasks, the success of these LGMs comes from extensive training on well-curated data spanning a wide range of fields. Given the tremendous success of LGMs in other domains and the practical potential of graph generative models, a natural question arises: can we develop large generative models for graph-structured data?

This work builds on two existing lines of generative modeling. First, Large Generative Models (LGMs) have recently achieved great success in generating meaningful content across a wide range of tasks and fields. In Natural Language Processing (NLP), for example, large language models trained to predict the next word can produce human-like text for tasks such as question answering and language translation. Second, graph generative models focus on creating realistic graphs that capture relationships in real-world data; they are used in applications such as generating molecular structures with desirable properties and crafting subtle adversarial attacks.

Researchers from Vanderbilt University, the University of Michigan, Adobe Research, and Intel Labs have introduced the Large Graph Generative Model (LGGM), a new class of graph generative model trained on a large corpus of graphs from 13 distinct domains. The pre-trained LGGM outperforms other graph generative models in zero-shot generative capability and can be easily fine-tuned with graphs from specific fields, surpassing models trained directly from scratch on the same data. LGGM can also generate graphs from text prompts, such as a description of the network name and domain together with network statistics.
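As an illustration, a conditioning prompt of the kind described above could be assembled from a network's name, domain, and summary statistics. The helper below is a hypothetical sketch: the field names, format, and example statistics are illustrative assumptions, not LGGM's actual prompt schema.

```python
# Hypothetical Text-to-Graph prompt builder; the schema is an
# illustrative assumption, not the paper's actual prompt format.
def make_prompt(name, domain, avg_degree=None, clustering=None):
    """Assemble a conditioning string from network metadata and statistics."""
    parts = [f"Network name: {name}", f"Domain: {domain}"]
    if avg_degree is not None:
        parts.append(f"Average degree: {avg_degree:.2f}")
    if clustering is not None:
        parts.append(f"Clustering coefficient: {clustering:.2f}")
    return "; ".join(parts)

# Example with made-up statistics for a Facebook-domain network
print(make_prompt("socfb-Caltech36", "Facebook", avg_degree=43.32, clustering=0.41))
```

Such a string would then be encoded (e.g., by a text encoder) and used to condition the generative model, letting users steer generation toward a desired domain and structure.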

Text-to-Graph generation gives users fine-grained control over the created graphs. Training LGGM also requires a large, well-organized corpus of graphs from various fields. Graphs are selected from the Network Repository across 13 domains covering a wide variety of real-world situations, including Facebook (FB), Animal Social (ASN), Email, Web, Road, Power, and Chemical (CHEM). Many real-world graphs contain thousands or even millions of nodes and edges, whereas advanced diffusion models such as DiGress and GDSS can only handle networks with a few hundred nodes. To address this scalability gap, subgraphs are sampled from graphs in certain domains.
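To make the scalability constraint concrete, the sketch below extracts a bounded subgraph by breadth-first search from a random seed node. This is one simple, illustrative way to cap graph size; the paper's exact sampling procedure may differ.

```python
# Illustrative subgraph sampling: grow a connected node set by BFS until
# a size cap is reached. This is an assumption for demonstration, not
# necessarily the sampling method used in the LGGM paper.
import random
from collections import deque

def bfs_subgraph(adj, max_nodes, seed=0):
    """Return the node set of a connected subgraph with at most
    `max_nodes` nodes, grown by BFS from a randomly chosen start node."""
    rng = random.Random(seed)
    start = rng.choice(sorted(adj))
    visited = {start}
    queue = deque([start])
    while queue and len(visited) < max_nodes:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in visited:
                visited.add(nbr)
                queue.append(nbr)
                if len(visited) == max_nodes:
                    break
    return visited

# Toy example: a 1,000-node path graph stands in for a large real network
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < 1000] for i in range(1000)}
nodes = bfs_subgraph(adj, max_nodes=300)
print(len(nodes))  # 300, small enough for models like DiGress or GDSS
```

BFS growth keeps the sampled subgraph connected, which helps preserve local structure (degree distribution, clustering) that the generative model is meant to learn.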

To show LGGM's practical value for real-world deployment, the fine-tuned LGGM is compared with DiGress trained directly on each domain. In most domains, LGGM achieves better generative performance on the same training graphs, owing to the knowledge acquired during pre-training. This advantage is even more pronounced when fewer training graphs are available, which matters for graph-generative applications in semi-supervised settings such as anomaly detection and drug design, where the relevant graphs make up only 0.05%-0.5% and 0.01% of all potential candidates, respectively.

In conclusion, the researchers have proposed LGGM, a new class of graph generative model trained on over 5,000 graphs sourced from 13 distinct domains of the well-known Network Repository. LGGM outperforms other graph generative models in zero-shot generative capability, can be easily fine-tuned with graphs from specific fields, and supports Text-to-Graph generation. Like LGMs in other fields, LGGMs do not specialize in generating graphs for specific domains, so a future direction is to evaluate their practical usefulness in application-oriented ways, such as producing higher-quality generated graphs for better data augmentation.


Check out the Paper. All credit for this research goes to the researchers of this project.
