Microsoft's AI-Driven Dev Tool Turns Whiteboard Sketches into Code

Microsoft introduced a Web tool driven by AI technologies and Azure cloud services that turns whiteboard sketches into HTML code for text boxes, check boxes, buttons and so on.

Aptly named Sketch2Code, the tool was introduced by a company AI program manager on the Azure dev site, Tara Jana, who said, "We hope this post helps you get started with AI and motivates you to become an AI developer."

Part of that motivation might be the company's claim that the tool can radically condense the time-consuming process of converting whiteboard designs into development code and provide instant results.

Sketch2Code uses AI computer vision technology -- providing object detection and text recognition -- to scan uploaded images and, leveraging pretrained custom models, translate sketched Web components into HMTL snippets. The Custom Vision Service is used to train models and detect drawn HTML objects, whereupon the text-recognition functionality of the service extracts the handwritten text.

"A layout algorithm uses the spatial information from all the bounding boxes of the predicted elements to generate a grid structure that accommodates all," Microsoft said. Finally, an HTML generation engine creates HTML markup code reflecting the result, using all of the gathered data.

Sketch2Code Architecture
[Click on image for larger view.] Sketch2Code Architecture (source: Microsoft).

"By combining these two pieces of information, we can generate the HTML snippets of the different elements in the design," the experimental tool's site says. "We then can infer the layout of the design from the position of the identified elements and generate the final HTML code accordingly."

The process is aided by several Azure cloud services. Specifically, Microsoft listed the following components that power the tool:

  • A Microsoft Custom Vision Model: This model has been trained with images of different handwritten designs tagging the information of most common HTML elements like buttons, text box, and images.
  • A Microsoft Computer Vision Service: To identify the text written into a design element a Computer Vision Service is used.
  • An Azure Blob Storage: All steps involved in the HTML generation process are stored, including the original image, prediction results and layout grouping information.
  • An Azure Function: Serves as the backend entry point that coordinates the generation process by interacting with all the services.
  • An Azure Web site: User font-end to enable uploading a new design and see the generated HTML results.

Source code for the project is available on GitHub.

About the Author

David Ramel is an editor and writer at Converge 360.

comments powered by Disqus

Featured

  • Cloud-Focused .NET Aspire 9.1 Released

    Along with .NET 10 Preview 1, Microsoft released.NET Aspire 9.1, the latest update to its opinionated, cloud-ready stack for building resilient, observable, and configurable cloud-native applications with .NET.

  • Microsoft Ships First .NET 10 Preview

    Microsoft shipped .NET 10 Preview 1, introducing a raft of improvements and fixes across performance, libraries, and the developer experience.

  • C# Dev Kit Previews .NET Aspire Orchestration

    Microsoft's dev team has been busy updating the C# Dev Kit, a Visual Studio Code extension that enhances the C# development experience by providing tools for managing, debugging, and editing C# projects.

  • Hands On: New VS Code Insiders Build Creates Web Page from Image in Seconds

    New Vision support with GitHub Copilot in the latest Visual Studio Code Insiders build takes a user-supplied mockup image and creates a web page from it in seconds, handling all the HTML and CSS.

  • Naive Bayes Regression Using C#

    Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the naive Bayes regression technique, where the goal is to predict a single numeric value. Compared to other machine learning regression techniques, naive Bayes regression is usually less accurate, but is simple, easy to implement and customize, works on both large and small datasets, is highly interpretable, and doesn't require tuning any hyperparameters.

Subscribe on YouTube

Upcoming Training Events