Kakizu: Turn your sketches into beautiful AI-generated art using Cloudflare!

ashish - Apr 15 - Dev Community

This is a submission for the Cloudflare AI Challenge.

What I Built

Hey there, creative minds! I've cooked up something super cool called Kakizu that's gonna blow your artistic socks off. Get ready to turn those quirky doodles and sketches into mind-blowing AI-generated artworks with just a few clicks!

Kakizu is all about letting your imagination run wild on its rad drawing canvas. Whether you're a seasoned artist or just someone who loves doodling during boring meetings or classes, this is your chance to let those crayon scribbles shine. Once you've worked your sketch magic, get ready for the real showstopper. With the power of Cloudflare's AI awesomeness, Kakizu will zap your doodle into a jaw-dropping AI-generated masterpiece. It's like having your own personal art genie, but way cooler (and no lamps required).

Demo

Demo Video -

Live App Link - https://kakizu.vercel.app/

My Code

All the code for this project is open source on GitHub. Do leave a ⭐ if you like the project!

Journey

After thinking a lot about what to build for this challenge, I came up with this idea: you scribble something for fun, and AI generates art based on that scribble!

I decided to use Next.js as my web app framework and shadcn/ui for the UI components. Initially I thought of using the img2img model from Cloudflare, but it didn't work as well as I had hoped. I realized that I needed to give the image model a detailed prompt, so I went with three separate models: one for identifying the objects in the image and captioning it, one for creating a prompt from that caption, and one for generating the final image.

Working with the Cloudflare AI models was easy and fun thanks to the amazing documentation. I used the REST API endpoints to integrate the models with my API.
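For anyone curious what a REST call like this can look like, here is a minimal sketch. It assumes the public Workers AI endpoint shape (`/accounts/{account_id}/ai/run/{model}` with a bearer token); the helper names `buildRunUrl`, `buildRunRequest`, and `runModel` are made up for illustration and are not taken from the Kakizu codebase.

```typescript
// Assumed base URL of the Cloudflare REST API (v4).
const CF_API_BASE = "https://api.cloudflare.com/client/v4";

// Hypothetical helper: builds the "run model" URL for a given account and model name.
function buildRunUrl(accountId: string, model: string): string {
  return `${CF_API_BASE}/accounts/${accountId}/ai/run/${model}`;
}

// Hypothetical helper: builds the fetch options for a JSON POST with a bearer token.
function buildRunRequest(
  apiToken: string,
  input: unknown
): { method: string; headers: Record<string, string>; body: string } {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(input),
  };
}

// Sends the request and returns the raw response. The caller decides how to read it:
// text models answer with JSON, while image models answer with binary image data.
async function runModel(
  accountId: string,
  apiToken: string,
  model: string,
  input: unknown
): Promise<Response> {
  const res = await fetch(buildRunUrl(accountId, model), buildRunRequest(apiToken, input));
  if (!res.ok) {
    throw new Error(`Workers AI request failed with status ${res.status}`);
  }
  return res;
}
```

In a Next.js app, a helper like this would typically live in a server-side API route so that the API token never reaches the browser.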

Creating the sketch pad for the app was one of the toughest tasks. I researched a lot to find the most appropriate library for the job and finally decided to use react-konva. Once the user draws something and clicks Generate, the sketch is exported as an image and then fed to the Cloudflare models to generate the final image.
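As a rough illustration of the export step: a Konva stage can be exported with `toDataURL()`, which returns a base64 data URL. The sketch below shows one way to turn such a data URL into raw bytes before sending it to a model. The function name `dataUrlToBytes` is hypothetical, and this is an assumption about how the conversion could be done, not the actual Kakizu code.

```typescript
// Hypothetical helper: decodes a base64 data URL (for example "data:image/png;base64,...")
// into the raw bytes of the image.
function dataUrlToBytes(dataUrl: string): Uint8Array {
  const marker = ";base64,";
  const markerIndex = dataUrl.indexOf(marker);
  if (dataUrl.indexOf("data:") !== 0 || markerIndex === -1) {
    throw new Error("Expected a base64-encoded data URL");
  }
  // atob turns the base64 payload into a binary string, one character per byte.
  const binary = atob(dataUrl.slice(markerIndex + marker.length));
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

With a ref to the react-konva `Stage`, the call would look something like `dataUrlToBytes(stageRef.current.toDataURL())`, where `stageRef` is an assumed name for that ref.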

Multiple Models and/or Triple Task Types

My project qualifies for the Multiple Model Per Task Criteria. I have used the following three models to create AI art from sketches:

  1. @cf/unum/uform-gen2-qwen-500m - for identifying the objects in the user-made sketch and captioning it.
  2. @cf/meta/llama-2-7b-chat-fp16 - for creating an appropriate prompt for AI image generation.
  3. @cf/lykon/dreamshaper-8-lcm - for generating the final image.
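The three-model chain above can be sketched roughly like this. The model names come from the list; everything else is an assumption for illustration: the `ModelRunner` type, the `sketchToArt` function, the instruction strings, and the input and output field names (`image`, `description`, `messages`, `response`, `prompt`), which should be checked against the Cloudflare docs for each model.

```typescript
// Hypothetical abstraction: something that runs a model by name and resolves to its result.
type ModelRunner = (model: string, input: Record<string, unknown>) => Promise<unknown>;

const CAPTION_MODEL = "@cf/unum/uform-gen2-qwen-500m";
const PROMPT_MODEL = "@cf/meta/llama-2-7b-chat-fp16";
const IMAGE_MODEL = "@cf/lykon/dreamshaper-8-lcm";

async function sketchToArt(sketch: Uint8Array, run: ModelRunner): Promise<unknown> {
  // Step 1: caption the sketch. Assumes the result carries a `description` field.
  const caption = (await run(CAPTION_MODEL, {
    image: Array.from(sketch),
    prompt: "Describe the objects in this sketch.",
  })) as { description: string };

  // Step 2: expand the caption into a detailed image prompt.
  // Assumes the result carries a `response` field.
  const promptResult = (await run(PROMPT_MODEL, {
    messages: [
      { role: "system", content: "Write a detailed prompt for an AI image generator." },
      { role: "user", content: caption.description },
    ],
  })) as { response: string };

  // Step 3: generate the final image from that prompt.
  return run(IMAGE_MODEL, { prompt: promptResult.response });
}
```

Passing the runner in as a parameter keeps the chain independent of how the models are actually called, so the same function works with a REST-based runner in production and a stub in tests.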