Using the Google Gemini API to Build an LLM Application
Following its recent announcement, Google has unveiled API access for its Gemini models. The current offering includes API access to Gemini Pro, encompassing both a text-only and a text-and-vision model. This release is noteworthy because, until now, Google had not integrated visual capabilities into Bard, which operated solely as a text-only model. The availability of this API key enables users to immediately test Gemini’s multimodal capabilities on their local computers. In this guide, we’ll explore how to access and utilize the Gemini API.
Google has introduced Gemini, a cutting-edge model that has revitalized the functionality of Bard. With Gemini, users can now obtain nearly flawless answers to their queries by presenting a combination of images, audio, and text.
This tutorial aims to provide insights into the Gemini API and guide you through its setup on your machine. Additionally, we will delve into different Python API functions, covering aspects such as text generation and image comprehension.
What is Gemini?
Introducing the AI Models Behind the Gemini API
Gemini, a recent AI innovation, emerges from collaborative efforts within various Google teams, including Google Research and Google DeepMind. Uniquely designed as a multimodal model, Gemini exhibits the capability to comprehend and interact with diverse data types such as text, code, audio, images, and video.
Distinguished as the most advanced and extensive AI model developed by Google to date, Gemini prioritizes adaptability, operating seamlessly across a spectrum of systems—from expansive data centers to portable mobile devices. This adaptability holds the promise of transforming the landscape for businesses and developers, offering new possibilities for building and scaling AI applications.
To facilitate exploration and utilization of Gemini’s capabilities, Google has introduced the Gemini API, which is free to access. Gemini Ultra demonstrates state-of-the-art performance, surpassing GPT-4 on various benchmarks. Notably, it stands out as the first model to outperform human experts on the Massive Multitask Language Understanding (MMLU) benchmark, showcasing its advanced understanding and problem-solving capabilities.
In the wake of the release of ChatGPT and OpenAI’s GPT models, coupled with their collaboration with Microsoft, Google seemed to recede from the limelight in the AI space. The subsequent year witnessed minimal major developments from Google, except for the PaLM API, which failed to captivate widespread attention. Suddenly, however, Google unveiled Gemini—a series of foundational models. Shortly after the Gemini launch, Google introduced the Gemini API, which we will explore in this guide. Finally, we will leverage the API to construct a basic chatbot.
Beginning with Google Gemini API Key
To commence our journey with Gemini, the initial step involves acquiring a free Google API Key, granting access to the functionalities of Gemini. This complimentary API Key is accessible through MakerSuite at Google, and the process of obtaining it is outlined in detail in this article. It is recommended to follow the step-by-step instructions provided in the article to create an account and secure the API Key, facilitating seamless integration with Gemini.
Setting Up the Google Gemini API Key:
To utilize the API, the initial step involves obtaining an API key, which can be acquired from the following link:
How to Access and Use Gemini API for Free:
- Open the link above, click the “Get an API key” button, and proceed to “Create API key in a new project.”
- Copy the API key and establish it as an environment variable. For users employing Deepnote, the process is streamlined. Simply navigate to the integration settings, scroll down, and select the environment variables section.
This ensures the proper setup for accessing and utilizing the Gemini API, allowing for a seamless and efficient integration process.
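For local development outside Deepnote, the key can be exported as a shell environment variable instead; the variable name GOOGLE_API_KEY matches what the Python code in this guide expects. A minimal sketch (bash/zsh, with a placeholder value):

```shell
# Export the key for the current shell session.
# Replace the placeholder with the key copied from MakerSuite.
export GOOGLE_API_KEY="your-api-key-here"

# Confirm it is visible to child processes such as the Python interpreter.
echo "$GOOGLE_API_KEY"
```

Note that `export` only lasts for the current session; add the line to your shell profile to make it persistent.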
Installing Dependencies:
Our initial step is to install the necessary dependencies outlined below:
!pip install google-generativeai langchain-google-genai streamlit
- google-generativeai Library: The google-generativeai library is a toolkit developed by Google for seamless interaction with Google’s models, including PaLM and Gemini Pro.
- langchain-google-genai Library: The langchain-google-genai library is designed to simplify the process of working with various large language models and developing applications with them. In this context, we are specifically installing the langchain library, which supports the new Google Gemini LLMs.
- Streamlit Web Framework: The third essential dependency is the Streamlit web framework. We will utilize Streamlit to craft a chat interface reminiscent of ChatGPT, seamlessly integrating Gemini and Streamlit for a dynamic user experience.
Configuring API Key and Initializing Gemini Model
Let’s delve into the coding process.
To begin, we’ll load the Google API Key as illustrated below:
import os
import google.generativeai as genai
os.environ['GOOGLE_API_KEY'] = "Your API Key"
genai.configure(api_key=os.environ['GOOGLE_API_KEY'])
- First, we store the API key obtained from MakerSuite in an environment variable named “GOOGLE_API_KEY.”
- Next, we import Google’s genai library and call its configure function.
- We pass the API key stored in the environment variable to configure through its api_key parameter.
- This sets the stage for working with the Gemini models seamlessly.
With these steps, we are now ready to engage with the capabilities offered by the Gemini models.
When a prompt trips Gemini’s safety filters and we then try to read the response, the library raises an error like this:

ValueError: The `response.parts` quick accessor only works for a single candidate, but none were returned. Check the `response.prompt_feedback` to see if the prompt was blocked.
Understanding Candidates in Gemini LLM:
In the context of the error, the term “candidate” refers to a potential response. When the Gemini LLM generates a response, it does so in the form of candidates. In this instance, the absence of a candidate indicates that the LLM did not generate any response. The error message prompts us to examine the response.prompt_feedback for additional diagnostic information. To delve deeper into the diagnosis, we will execute the following:
print(response.prompt_feedback)
Output:
In the output above, the block reason is attributed to safety concerns. Below that, safety ratings are provided for four distinct categories. These ratings apply to the prompt/query submitted to the Gemini LLM, serving as feedback on the provided input. Two notable red flags emerge, particularly in the Harassment and Danger categories.
- Harassment Category: The elevated probability in this category can be attributed to the mention of “stalking” in the prompt.
- Danger Category: The high probability here is linked to the presence of “gunpowder” in the prompt.
The prompt_feedback attribute proves invaluable in diagnosing issues with the prompt, shedding light on why the Gemini LLM declined to provide a response.
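When the feedback reveals a safety block you consider a false positive, the generation call accepts a safety_settings argument that relaxes per-category blocking thresholds. A minimal sketch using the library’s string forms for categories and thresholds (the specific threshold choice here is an illustrative assumption, not a recommendation):

```python
# Relax the two categories flagged in the feedback above so that only
# high-probability harms are blocked. Each entry pairs a harm category
# with a blocking threshold, using google-generativeai's string forms.
safety_settings = [
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_ONLY_HIGH"},
]

# Passed alongside the prompt when generating, e.g.:
# model.generate_content(prompt, safety_settings=safety_settings)
```

Loosening safety thresholds should be done deliberately and only where your application genuinely needs it.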