How to Build a Calorie Advisor App Using GenAI?

Introduction

Artificial Intelligence has many use cases, and some of the best ones are in the health industry, where it can help people maintain a healthier life. With the recent boom in generative AI, applications that used to be complex can now be built with far less effort. One very useful application of this kind is the Calorie Advisor App, and it is the only one we will look at in this article, inspired by taking care of our health. We will be building a simple Calorie Advisor App where we can input images of food, and the app will help us calculate the calories of each item present in the food. This project is a part of NutriGen, focusing on health through AI.

Learning Objective

The app we will be creating in this article is based on basic prompt engineering and image processing techniques.
We will be using the Google Gemini Pro Vision API for our use case.
Then, we will create the code's structure, where we will perform image processing and prompt engineering. Finally, we will work on the user interface using Streamlit.
After that, we will deploy our app to the Hugging Face platform for free.
We will also see some of the problems we will face in the output, where Gemini fails to identify a food item and gives the wrong calorie count for that food. We will also discuss different solutions to this problem.

Pre-Requisites

Let's start implementing our project, but before that, please ensure you have a basic understanding of generative AI and LLMs. It's okay if you know very little because, in this article, we will be implementing things from scratch.

Essential Python, prompt engineering, a basic understanding of generative AI, and familiarity with Google Gemini are required. Additionally, basic knowledge of Streamlit, GitHub, and Hugging Face is necessary. Familiarity with libraries such as PIL for image preprocessing is also helpful.

This article was published as a part of the Data Science Blogathon.

Project Pipeline

In this article, we will be building an AI assistant that helps nutritionists and individuals make informed decisions about their food choices and maintain a healthy lifestyle.

The flow will be like this: input image -> image processing -> prompt engineering -> final function call to get the output for the input image of the food. This is a brief overview of how we will approach this problem statement.

Overview of Gemini Pro Vision

Gemini Pro is a multimodal LLM built by Google. It was trained to be multimodal from the ground up. It performs well on various tasks, including image captioning, classification, summarisation, and question answering. One of the interesting facts about it is that it uses the well-known Transformer decoder architecture. It was trained on multiple kinds of data, which reduces the complexity of handling multimodal inputs and helps it provide quality outputs.

Step1: Creating the Virtual Environment

Creating a virtual environment is good practice: it isolates our project and its dependencies so that they do not conflict with other projects, and it lets us keep different versions of the libraries we need in different virtual environments. So, we will create a virtual environment for the project now. To do this, follow the steps below:

Create an empty folder on the desktop for the project.
Open this folder in VS Code.
Open the terminal.

Run the following commands:

pip install virtualenv
python -m venv genai_project

You can use the following command if you are getting a set-execution-policy error:

Set-ExecutionPolicy RemoteSigned -Scope Process

Now we need to activate our virtual environment. To do that, use the following command:

.\genai_project\Scripts\activate

We have successfully created our virtual environment.
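As a quick sanity check after activation, a minimal sketch like the one below can confirm that the interpreter you are running belongs to a virtual environment. The function name in_virtual_environment is hypothetical; the check relies on sys.prefix and sys.base_prefix, which differ inside environments created with venv (and recent virtualenv versions).

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtual_environment())
    print("Interpreter prefix:", sys.prefix)
```

If this prints False right after activation, the terminal is most likely still using the system-wide Python.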

Step: Create a Virtual Environment in Google Colab

We can also create our virtual environment in Google Colab; here is the step-by-step procedure to do that:

Create a new Colab notebook.
Use the commands below step by step.

!which python
!python --version
# check whether Python is installed and which version is active

%env PYTHONPATH=
# set the PYTHONPATH environment variable to an empty value so that Python
# will not search for modules and packages in additional directories. It helps
# in avoiding conflicts or unintended module loading.

!pip install virtualenv

# create the virtual environment
!virtualenv genai_project

!wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

# this downloads the Miniconda installer script, which is used to create
# and manage virtual environments in Python

!chmod +x Miniconda3-latest-Linux-x86_64.sh
# this command makes the Miniconda installer script executable inside
# the Colab environment

!./Miniconda3-latest-Linux-x86_64.sh -b -f -p /usr/local
# this runs the Miniconda installer script and
# specifies the path where Miniconda should be installed

!conda install -q -y --prefix /usr/local python=3.8 ujson
# this installs Python 3.8 and ujson into our environment

import sys
sys.path.append('/usr/local/lib/python3.8/site-packages/')
# this allows Python to locate and import modules from the environment directory

import os
os.environ['CONDA_PREFIX'] = '/usr/local/envs/myenv'

# used to activate the Miniconda environment

!python --version
# checks the version of Python within the activated Miniconda environment

Hence, we have also created our virtual environment in Google Colab. Now, let's check and see how we can make a basic .py file there.

!source myenv/bin/activate
# activating the virtual environment

!echo "print('Hello, world!')" >> my_script.py
# writing code using echo and saving this code in the my_script.py file

!python my_script.py
# running the my_script.py file

This will print Hello, world! in the output. So, that's it. That was all about working with virtual environments in Google Colab. Now, let's proceed with the project.

Step2: Importing Necessary Libraries

import streamlit as st
import google.generativeai as genai
import os
from dotenv import load_dotenv
load_dotenv()
from PIL import Image

If you are having trouble importing any of the above libraries, you can always use the command "pip install library_name" to install it.

We are using the Streamlit library to create the basic user interface. The user will be able to upload an image and get the outputs based on that image.

We use Google Generative AI to get the LLM and analyze the image to get the item-wise calorie count of our food.

Image (from PIL) is used to perform some basic image preprocessing.

Step3: Setting up the API Key

Create a new .env file in the same directory and store your API key in it. You can get the Google Gemini API key from Google MakerSuite.
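For reference, the .env file is assumed to hold a single line of the form GOOGLE_API_KEY=your_key, matching the variable name read later with os.getenv. The sketch below shows one way to fail early with a readable message when the key is missing. The helper require_api_key is hypothetical and not part of the project code.

```python
import os


def require_api_key(name: str = "GOOGLE_API_KEY", env=None) -> str:
    """Return the API key stored under `name`, or raise a clear error.

    `require_api_key` is a hypothetical helper for illustration; `env`
    defaults to os.environ, which load_dotenv() fills from the .env file.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return key
```

Calling require_api_key() once at startup, after load_dotenv(), surfaces a missing key immediately instead of at the first model call.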

Step4: Response Generator Function

Here, we will create a response generator function. Let's break it down step by step:

First, we use genai.configure to configure the API key we created on the Google MakerSuite website. Then, we define the function get_gemini_response, which takes two input parameters: the input prompt and the image. This is the primary function that returns the output as text.

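genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

def get_gemini_response(input_prompt, image):
    # load the multimodal Gemini Pro Vision model
    model = genai.GenerativeModel('gemini-pro-vision')
    # send the prompt together with the first image part
    response = model.generate_content([input_prompt, image[0]])
    # return only the generated text
    return response.text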

Here, we are using the 'gemini-pro-vision' model because it is multimodal. After creating our model with genai.GenerativeModel, we simply pass our prompt and the image data to it. Finally, based on the instructions provided in the prompt and the image data we fed in, the model returns text that represents the calorie count of the different food items present in the image.
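To try the call flow without an API key or network access, one option is to pass in a stand-in model. The sketch below is an illustration under assumptions: FakeModel, FakeResponse, and get_response_text are hypothetical names, and the stand-in only mimics the generate_content call and the .text attribute used above.

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Stand-in for the response object; only the .text attribute is mimicked."""
    text: str


class FakeModel:
    """Hypothetical stand-in for genai.GenerativeModel, used for offline testing."""

    def generate_content(self, parts):
        prompt, image_part = parts
        size = len(image_part["data"])
        return FakeResponse(text=f"Received prompt ({len(prompt)} chars) and image ({size} bytes)")


def get_response_text(model, input_prompt, image):
    """Same call shape as get_gemini_response, with the model passed in."""
    response = model.generate_content([input_prompt, image[0]])
    return response.text


image_parts = [{"mime_type": "image/png", "data": b"\x89PNG"}]
print(get_response_text(FakeModel(), "Count the calories", image_parts))
```

Passing the model as a parameter is a design choice for testability; the app itself can keep creating the real model inside the function.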

Step5: Image Preprocessing

This function checks whether the uploaded_file parameter is not None, meaning the user has uploaded a file. If a file has been uploaded, the code reads the file content into bytes using the getvalue() method of the uploaded_file object. This returns the uploaded file's raw bytes.

The bytes obtained from the uploaded file are stored in a dictionary under the keys "mime_type" and "data". The "mime_type" key stores the uploaded file's MIME type, which indicates the type of content (e.g., image/jpeg, image/png). The "data" key stores the uploaded file's raw bytes.

The image data is then stored in a list named image_parts, which contains a dictionary with the uploaded file's MIME type and data. If no file was uploaded, the function raises a FileNotFoundError.

def input_image_setup(uploaded_file):
    if uploaded_file is not None:
        # Read the file into bytes
        bytes_data = uploaded_file.getvalue()
        image_parts = [
            {
                "mime_type": uploaded_file.type,
                "data": bytes_data
            }
        ]
        return image_parts
    else:
        raise FileNotFoundError("No file uploaded")
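The function can be exercised outside Streamlit with a small stand-in for the uploaded file. In this sketch, FakeUpload is a hypothetical class that only provides the two members the function relies on (the type attribute and the getvalue() method), and the function body is a condensed copy of the one above.

```python
import io


class FakeUpload(io.BytesIO):
    """Hypothetical stand-in for Streamlit's uploaded file: bytes plus a MIME type."""

    def __init__(self, data: bytes, mime_type: str):
        super().__init__(data)
        self.type = mime_type


def input_image_setup(uploaded_file):
    if uploaded_file is not None:
        return [{"mime_type": uploaded_file.type, "data": uploaded_file.getvalue()}]
    raise FileNotFoundError("No file uploaded")


parts = input_image_setup(FakeUpload(b"fake-image-bytes", "image/jpeg"))
print(parts[0]["mime_type"], len(parts[0]["data"]))
```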

Step6: Creating the UI

So, finally, it's time to create the user interface for our project. As mentioned before, we will be using the Streamlit library to write the code for the front end.

## initialising the streamlit app
st.set_page_config(page_title="Calories Advisor App")
st.header("Calories Advisor App")
uploaded_file = st.file_uploader("Choose an image...", type=["jpg", "jpeg", "png"])
image = ""
if uploaded_file is not None:
    image = Image.open(uploaded_file)
    st.image(image, caption="Uploaded Image", use_column_width=True)
submit = st.button("Tell me about the total calories")

Initially, we set up the page configuration using set_page_config and gave the app a title. Then, we created a header and added a file uploader box where users can upload images. st.image shows the image that the user uploaded in the UI. At the end, there is a submit button; once it is clicked, we get the outputs from our large language model, Gemini Pro Vision.

Step7: Writing the System Prompt

Now is the time to be creative. Here, we will create our input prompt, asking the model to act as an expert nutritionist. It is not necessary to use the prompt below; you can also provide your own custom prompt. Based on the input image of the food, we ask the model to read the image data and generate an output that gives the calorie count of the food items present in the image and a judgment of whether the food is healthy or unhealthy. If the food is unhealthy, we ask it to suggest more nutritious alternatives to the food items in our image. You can customize the prompt further according to your needs and get an excellent way to keep track of your health.

Sometimes the model might not be able to read the image data properly; we will discuss solutions for this at the end of this article.

input_prompt = """

You are an expert nutritionist where you need to see the food items from the 
image and calculate the total calories, also give the details of all 
the food items with their respective calorie count in the below format.

        1. Item 1 - no of calories

        2. Item 2 - no of calories

        ----

        ----

Finally you can also mention whether the food is healthy or not and also mention 
the percentage split ratio of carbohydrates, fats, fibers, sugar, protein and 
other important things required in our diet. If you find that food is not healthy 
then you must provide some alternative healthy food items that user can have 
in diet.

"""
if submit:

    image_data = input_image_setup(uploaded_file)

    response = get_gemini_response(input_prompt, image_data)

    st.header("The Response is: ")

    st.write(response)

Finally, we check whether the user has clicked the Submit button; if so, we get the image data from the input_image_setup function we created earlier. Then, we pass our input prompt and this image data to the get_gemini_response function we created earlier. In other words, we call all the functions we created earlier to get the final output, which is stored in response.
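Because the prompt asks for lines of the form "1. Item 1 - no of calories", the returned text can be post-processed, for example to compute a total. The following is only a sketch under the assumption that the model follows the requested format, which is not guaranteed; parse_calorie_lines is a hypothetical helper, and the sample text is made up for illustration rather than real model output.

```python
import re

# Matches lines such as "1. Rice - 200 calories" (the format requested in the prompt).
ITEM_LINE = re.compile(r"^\s*\d+\.\s*(?P<name>.+?)\s*-\s*(?P<calories>\d+(?:\.\d+)?)")


def parse_calorie_lines(text: str) -> dict:
    """Collect {item name: calories} from lines that follow the requested format."""
    items = {}
    for line in text.splitlines():
        match = ITEM_LINE.match(line)
        if match:
            items[match.group("name")] = float(match.group("calories"))
    return items


# Made-up sample text, used only to demonstrate the parsing.
sample = """1. Rice - 200 calories
2. Dal - 150 calories
The meal is fairly balanced."""
items = parse_calorie_lines(sample)
print(items, "total:", sum(items.values()))
```

Lines that do not match the pattern are skipped, so free-form commentary in the response does not break the parsing.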

Step8: Deploying the App on Hugging Face

Now is the time for deployment. Let's begin.

I will explain the simplest way to deploy the app that we created. There are two options we can look into if we want to deploy our app: one is Streamlit Share, and the other is Hugging Face. Here, we will use Hugging Face for the deployment; you can try exploring deployment on Streamlit Share if you want. Here's the reference link for that: Deployment on Streamlit Share

First, let's quickly create the requirements.txt file we need for the deployment.

Open the terminal and run the command below to create a requirements.txt file.

pip freeze > requirements.txt

This will create a new text file named requirements.txt. All the project dependencies will be listed there. If this causes an error, that's okay. You can always create a new text file in your working directory and copy and paste the requirements.txt file from the GitHub link I'll provide next.
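If pip freeze lists far more packages than the app needs, a shorter file can be written by hand. The sketch below is an assumption rather than the project's actual requirements.txt: the package list is inferred from the imports in Step 2, versions are left unpinned, and write_requirements is a hypothetical helper.

```python
from pathlib import Path

# Assumed package names on PyPI for the imports used in Step 2; versions are
# left unpinned here, so pin them if you need reproducible builds.
PACKAGES = ["streamlit", "google-generativeai", "python-dotenv", "Pillow"]


def write_requirements(path="requirements.txt", packages=PACKAGES) -> str:
    """Write one package per line and return the text that was written."""
    content = "\n".join(packages) + "\n"
    Path(path).write_text(content, encoding="utf-8")
    return content
```

Calling write_requirements() from the project folder produces a four-line requirements.txt that can be compared with the one in the repository.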

Now, make sure that you have these files handy (because that's what we need for the deployment):

app.py
.env (for the API credentials)
requirements.txt

Take all these files and, if you don't have one already, create an account on Hugging Face. Then, create a new Space and upload the files there. That's all. Your app will be deployed automatically this way. You will also be able to see how the deployment is progressing in real time. If some error occurs, you can always figure it out with the simple interface and, of course, the Hugging Face community, which has a lot of content on resolving common bugs during deployment.

After some time, you will be able to see the app working. Woo hoo! We have finally created and deployed our calorie predictor app. Congratulations! You can share the working link of the app you just built with your friends and family.

Here's the working link to the app that we just created: The Calorie Advisor App

Let's test our app by providing an input image to it:

Before:

After:

Full Project GitHub Link

Here's the complete GitHub repository link that includes the source code and other helpful information regarding the project.

You can clone the repository and customize it according to your requirements. Try to be more creative and clear in your prompt, as this will give your model more power to generate correct and accurate outputs.

Scope of Improvement

Problems that can occur in the outputs generated by the model and their solutions:

Sometimes, there could be situations where you will not get the correct output from the model. This may happen because the model was not able to identify the food in the image correctly. For example, if you give it input images of your food and your food contains pickles, then the model might take them for something else. This is the primary concern here.

One way to deal with this is through effective prompt engineering techniques, like few-shot prompting, where you feed the model examples, and it then generates the outputs based on what it learned from those examples and the prompt you provided.
Another solution that can be considered here is creating our own custom data and fine-tuning the model on it. We can create data containing an image of the food item in one column and a description of the food items present in the other column. This will help our model learn the underlying patterns and identify the items in the provided image correctly. This way, we get more accurate calorie counts for the food images.
We can take it further by asking the user about his/her nutrition goals and asking the model to generate outputs based on them. (This way, we can tailor the outputs generated by the model and give more user-specific outputs.)
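The first and third ideas above can be combined in a small prompt builder. This is a minimal text-only sketch: build_prompt is a hypothetical helper, the examples are placeholders rather than nutritional data, and real few-shot prompting for a vision model would typically also include example images.

```python
def build_prompt(base_prompt, examples=(), user_goal=None):
    """Append few-shot examples and an optional nutrition goal to a base prompt."""
    sections = [base_prompt.strip()]
    if examples:
        lines = ["Examples of correctly identified items:"]
        for description, answer in examples:
            lines.append(f"- Image shows: {description} -> {answer}")
        sections.append("\n".join(lines))
    if user_goal:
        sections.append(f"The user's nutrition goal is: {user_goal}. Tailor the advice to it.")
    return "\n\n".join(sections)


# Placeholder example and goal, for illustration only.
prompt = build_prompt(
    "You are an expert nutritionist.",
    examples=[("sliced pickles next to rice", "Pickle, not cucumber salad")],
    user_goal="lose weight",
)
print(prompt)
```

The resulting string can be passed in place of input_prompt, so the rest of the app stays unchanged.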

Conclusion

We've delved into the practical application of Generative AI in healthcare, focusing on the creation of the Calorie Advisor App. This project showcases the potential of AI to assist individuals in making informed decisions about their food choices and maintaining a healthy lifestyle. From setting up our environment to implementing image processing and prompt engineering techniques, we've covered the essential steps. The app's deployment on Hugging Face demonstrates its accessibility to a wider audience. Challenges like image recognition inaccuracies were addressed with solutions such as effective prompt engineering. As we conclude, the Calorie Advisor App stands as a testament to the transformative power of Generative AI in promoting well-being.

Key Takeaways

We have discussed a lot so far, starting with the project pipeline and then a basic introduction to the large language model Gemini Pro Vision.
Then, we started with the hands-on implementation. We created our virtual environment and our API key from Google MakerSuite.
Then, we did all our coding in the created virtual environment. Further, we discussed how to deploy the app on multiple platforms, such as Hugging Face and Streamlit Share.
Apart from that, we considered the possible problems that can occur and discussed solutions to those problems.
It was fun working on this project. Thanks for staying till the end of this article; I hope you got to learn something new.

Frequently Asked Questions

Q1. What is the Google Gemini Pro Vision Model?

A. Google developed Gemini Pro Vision, a renowned LLM known for its multimodal capabilities. It performs tasks like image captioning, generation, and summarization. Users can create an API key on the MakerSuite website to access Gemini Pro Vision.

Q2. How can Generative AI be applied to the Healthcare/Nutrition domain?

A. Generative AI has a lot of potential for solving real-world problems. In the health/nutrition domain, for example, it can help doctors give medicine prescriptions based on symptoms, and it can act as a nutrition advisor that gives users healthy recommendations for their diets.

Q3. How does prompt engineering solve the Generative AI use case?

A. Prompt engineering is an essential skill to master these days. The best place to learn prompt engineering from basic to advanced is here: https://www.promptingguide.ai/

Q4. How to improve the model's ability to generate more correct outputs?

A. To increase the model's ability to generate more correct outputs, we can use the following tactics: effective prompting, fine-tuning, and Retrieval-Augmented Generation (RAG).

The media shown in this article is not owned by Analytics Vidhya and is used at the Author's discretion.