
Enhancing Embedded Content: A Focus on Web Browser Engineering



**Supporting Embedded Content**

Support for embedded content is what lets a browser render web pages that contain more than text, such as images and other web pages embedded within a page. It has significant implications for browser architecture, performance, security, and open information access, and it has played a key role throughout the history of the web.

**Images: The Most Popular Embedded Content**

Images are the most popular form of embedded content on the web, dating back to the early 1990s. They are only introduced in Chapter 15 of this book because Tkinter’s support for image formats and for sizing and clipping is limited.

Inconsistencies in the way images are included on web pages, such as the use of “src” versus “href” or “img” versus “image,” can be attributed to this history. Images are typically included on web pages through the <img> tag, whose src attribute gives the URL of the image to load.

Example of an image rendered this way: Hypertext Editing System (Gregory Lloyd from Wikipedia, CC BY 2.0)

**Downloading and Displaying Images**

To display images in our browser, we need to follow four steps:

1. Download the image from a provided URL.
2. Decode the image into a memory buffer.
3. Lay out the image on the web page.
4. Paint the image on the display list.

Let’s start with the first step of downloading images from a URL. Since images use binary data formats instead of textual data, we need to extend our existing HTTP request function to support binary data. This can be done by making a small change: instead of passing the “r” flag to makefile, we pass a “b” flag that indicates binary mode:

```python
def request(url, top_level_url, payload=None):
    # ...
    response = s.makefile("b")
    # ...
```

With this change, every time we read from the response, we receive bytes of binary data instead of a string with textual data. Therefore, we need to update our HTTP parser code to explicitly decode the data:

```python
def request(url, top_level_url, payload=None):
    # ...
    statusline = response.readline().decode("utf8")
    # ...
    while True:
        line = response.readline().decode("utf8")
        # ...
    # ...
```

Note that we don’t add a decode call when reading the body because the body might contain binary data that needs to be returned directly to the browser.
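The split between decoded headers and an undecoded body can be tried in isolation. The following is a minimal sketch, not the browser's actual request function: it uses io.BytesIO as a stand-in for the binary file object returned by makefile, and the helper name parse_response and the sample response are hypothetical.

```python
import io

def parse_response(response):
    """Parse an HTTP response from a binary file-like object.

    The status line and headers are decoded to text; the body is
    returned as raw bytes so the caller can decide how to use it.
    """
    statusline = response.readline().decode("utf8")
    version, status, explanation = statusline.split(" ", 2)

    headers = {}
    while True:
        line = response.readline().decode("utf8")
        if line == "\r\n":
            break  # a blank line ends the header section
        header, value = line.split(":", 1)
        headers[header.lower()] = value.strip()

    body = response.read()  # still bytes: may be text or an image
    return status, headers, body

# Stand-in for the socket's binary stream: a fake response whose
# body is the 8-byte PNG signature, which is not valid UTF-8.
raw = (b"HTTP/1.0 200 OK\r\n"
       b"Content-Type: image/png\r\n"
       b"\r\n"
       b"\x89PNG\r\n\x1a\n")
status, headers, body = parse_response(io.BytesIO(raw))
```

Here body begins with the byte 0x89, so calling body.decode("utf8") on it would raise a UnicodeDecodeError. That is the practical reason for leaving the decision to decode with the caller.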

For every existing call to the request function where textual data is expected, we need to decode the response. For example, in the load function, we can make the following change:

```python
class Tab:
    def load(self, url, body=None):
        # ...
        headers, body = request(url, self.url, body)
        body = body.decode("utf8")
        # ...
```

Make sure to apply this change wherever the request function is called throughout the browser, including XMLHttpRequest_send and other places in the load function.

However, when we download images, we don’t call decode and instead use the binary data directly:

```python
class Tab:
    def load(self, url, body=None):
        # ...
        images = [node
            for node in tree_to_list(self.nodes, [])
            if isinstance(node, Element)
            and node.tag == "img"]
        for img in images:
            src = img.attributes.get("src", "")
            image_url = resolve_url(src, self.url)
            assert self.allowed_request(image_url), \
                "Blocked load of " + image_url + " due to CSP"
            header, body = request(image_url, self.url)
```
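The URL resolution step can be explored on its own. This sketch assumes the standard library's urllib.parse.urljoin as a stand-in for the browser's resolve_url, which may behave differently in edge cases; the image_urls helper and the sample URLs are hypothetical.

```python
from urllib.parse import urljoin

def image_urls(srcs, page_url):
    """Resolve each img src attribute against the page's URL."""
    return [urljoin(page_url, src) for src in srcs]

page = "http://example.org/blog/post.html"
urls = image_urls(
    ["cat.png", "/static/logo.png", "http://cdn.example.com/a.jpg"],
    page)
```

A relative src resolves next to the page, a host-relative src resolves against the site root, and an absolute URL is left unchanged.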

**Converting Binary Data to Skia Image Objects**

Once an image has been downloaded, we need to convert it into a Skia Image object. This can be done using the following code:

```python
class Tab:
    def load(self, url, body=None):
        for img in images:
            # ...
            img.encoded_data = body
            data = skia.Data.MakeWithoutCopy(body)
            img.image = skia.Image.MakeFromEncoded(data)
```

This code has two important steps. First, the downloaded data is turned into a Skia Data object using the MakeWithoutCopy method. Then, the data is converted into an image using MakeFromEncoded. By using MakeWithoutCopy, we avoid unnecessary memory usage and time consumption by referencing the existing body data instead of copying it. To ensure the data remains valid, we save the body in an encoded_data field.

Although this approach works, a better solution would be to directly write the response into a Skia Data object using the writable_data API. However, implementing this change would require significant refactoring of the browser code, which is why we have chosen to stick with the current method for simplicity.
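The difference between referencing a buffer and copying it can be illustrated without Skia. The sketch below is only an analogy in pure Python: a memoryview refers to an existing bytes buffer without copying it, much as the Skia Data object refers to body. It calls no Skia API, and the sample bytes are made up.

```python
# Pretend image payload: a PNG signature followed by padding.
body = b"\x89PNG\r\n\x1a\n" + bytes(1024)

view = memoryview(body)  # references body's buffer, no copy made
copy = bytes(view)       # an independent copy of the same bytes

# The view is backed by the very same object as body...
shares_buffer = view.obj is body
# ...while the copy merely has equal content.
same_content = copy == body
```

One difference matters here: a memoryview keeps its underlying buffer alive on its own, whereas the text above stores body in encoded_data precisely so that the referenced data remains valid.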

**Handling Failure and Broken Images**

It is essential to handle failure cases during the download and decoding steps. If an error occurs, we can display a “broken image” placeholder, as shown below:

```python
BROKEN_IMAGE = skia.Image.open("Broken_Image.png")

class Tab:
    def load(self, url, body=None):
        for img in images:
            try:
                # ...
            except Exception as e:
                print("Exception loading image: url="
                    + image_url + " exception=" + str(e))
                img.image = BROKEN_IMAGE
```

This ensures that when an image cannot be downloaded or decoded, the page still loads and the user sees a placeholder where the image would have been.
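The fallback pattern can be exercised without a network connection or Skia. In this sketch, fetch and decode are hypothetical callables standing in for request and skia.Image.MakeFromEncoded, the placeholder is an arbitrary sentinel rather than a real image, and the check for a None result is an assumption about how a decoder might signal bad data.

```python
BROKEN_IMAGE = "<broken image placeholder>"  # sentinel for this sketch

def load_image(image_url, fetch, decode):
    """Return a decoded image, or the placeholder if any step fails."""
    try:
        body = fetch(image_url)
        image = decode(body)
        if image is None:
            # Assumption: the decoder returns None for unusable data.
            raise ValueError("could not decode image data")
        return image
    except Exception as e:
        print("Exception loading image: url=" + image_url
              + " exception=" + str(e))
        return BROKEN_IMAGE

def failing_fetch(url):
    raise ConnectionError("host unreachable")

ok = load_image("http://example.org/a.png",
                lambda url: b"data", lambda body: ("image", body))
bad_fetch = load_image("http://example.org/b.png",
                       failing_fetch, lambda body: body)
bad_decode = load_image("http://example.org/c.png",
                        lambda url: b"junk", lambda body: None)
```

Both failure modes, a failed download and undecodable data, end up on the same path, so the rest of the browser only ever sees a usable image value.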

**Rendering Downloaded Images**

After downloading the images and wrapping them in Skia Image objects, we need to render them on the web page. Skia, our rendering library, performs the actual decoding itself when an image is drawn, so drawing the image is straightforward:

```python
class DrawImage(DisplayItem):
    def __init__(self, image, rect):
        super().__init__(rect)
        self.image = image

    def execute(self, canvas):
        canvas.drawImageRect(self.image, self.rect)
```

Skia applies various optimizations to the decoding process, such as decoding the image to its eventual size and caching the decoded image as long as possible. Additionally, there is an HTML API that allows web page authors to control the decoding process, enabling them to indicate when to pay the decoding cost. Since decoding image data requires significant memory and time, optimizing image handling is crucial for a performant browser.

**MIME Types and Content Types**

The HTTP Content-Type header informs the browser whether a document contains text or binary data. It provides a value called a MIME type, which specifies the type of the content. For example, text/html, text/css, and text/javascript indicate HTML, CSS, and JavaScript files, respectively. Similarly, image/png and image/jpeg are the MIME types for PNG and JPEG images. There are various MIME types for different font, video, audio, and data formats.
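As a rough illustration of how a Content-Type value might be inspected, the sketch below splits the header into a MIME type and its parameters. The helper names parse_content_type and is_textual are hypothetical, the parsing is simplified (it does not handle quoted values that contain semicolons), and treating only text/* as textual is a heuristic, not a rule from any specification.

```python
def parse_content_type(value):
    """Split a Content-Type header value into (mime_type, params)."""
    mime, *params = [part.strip() for part in value.split(";")]
    parsed = {}
    for param in params:
        if "=" in param:
            key, val = param.split("=", 1)
            parsed[key.strip().lower()] = val.strip().strip('"')
    return mime.lower(), parsed

def is_textual(mime):
    """Rough heuristic: treat the text/* family as decodable text."""
    return mime.startswith("text/")

html = parse_content_type("text/html; charset=UTF-8")
png = parse_content_type("image/png")
```

With a helper like this, a caller could decode the body only when is_textual returns true and pass other bodies along as bytes.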

Originally used for enumerating acceptable data formats for email attachments, MIME (Multipurpose Internet Mail Extensions) now plays a critical role in web content. Most email clients are accessed through web browsers, and emails are often HTML-encoded using the text/html MIME type. However, some email clients still provide an option to encode emails in text/plain.

In the code provided, we didn’t explicitly handle the image format because Skia automatically handles the decoding process for different image formats.
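One reason a decoder can work without being told the format is that common image formats begin with a recognizable byte signature. The sketch below illustrates that idea only; it is not a description of how Skia is implemented, and the sniff_image_type helper and its small signature table are hypothetical.

```python
# Leading bytes of a few common image formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]

def sniff_image_type(data):
    """Guess an image MIME type from the first bytes, or return None."""
    for signature, mime in SIGNATURES:
        if data.startswith(signature):
            return mime
    return None

png_type = sniff_image_type(b"\x89PNG\r\n\x1a\n" + b"rest of file")
jpeg_type = sniff_image_type(b"\xff\xd8\xff\xe0" + b"rest of file")
unknown = sniff_image_type(b"<html>")
```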

**Conclusion**

Supporting embedded content, such as images, is an essential aspect of web browser engineering. By following the steps outlined in this article, we can download, decode, and render images in a browser. The changes made to the request function, the conversion of binary data to Skia Image objects, and the handling of failure cases ensure a smooth experience for users when dealing with embedded content. Additionally, understanding MIME types and content types is crucial for correctly interpreting and rendering various types of content on the web.


