Extensible Markup Language (XML)
XML, which stands for Extensible Markup Language, is a flexible text-based format widely used to structure, store, and transport data. Published as a W3C Recommendation in 1998, XML is both human-readable and machine-readable. Its design emphasizes simplicity, generality, and usability across the Internet. XML is a meta-language: it lets users define their own markup languages for many different types of documents. It resembles HTML in appearance, but whereas HTML displays content using a fixed set of pre-defined tags, XML lets developers create their own tags to describe and categorize information. This makes it versatile for applications such as web development, software configuration, and data exchange in business environments. XML is also widely used as a format for document storage and processing.
Functions of XML:
- Data Storage and Transport:
XML provides a platform-independent format for storing data, making it easy to transport data across different systems.
- Data Description:
XML is self-descriptive: its tags name the data they contain, which makes the data easy to read and understand.
- Data Interchange:
XML is used to exchange many kinds of data on the web and elsewhere.
- Configuration Files:
XML is commonly used for configuration files in many applications and devices because both humans and machines can read and write it easily.
- Web Services:
SOAP (Simple Object Access Protocol), a messaging protocol used in web services, encodes its messages in XML.
- Document Structuring:
Because it allows users to define their own tags, XML is well suited to structuring documents with complex hierarchies or detailed metadata.
- Database Interfacing:
XML can be used to exchange data between a database and a user or between different databases.
- Simplification of Data Sharing:
XML simplifies data sharing between different computing systems, particularly via the Internet, because any conforming parser decodes the same structured data in the same way.
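The storage and transport role described above can be sketched in Python with the standard library's xml.etree.ElementTree module. This is only an illustrative sketch: the inventory, item, name, and price tags and the sample records are hypothetical and do not come from any particular schema. It serializes a list of records to an XML string and parses that string back, as a receiving system might.

```python
import xml.etree.ElementTree as ET

# Hypothetical records; tag and field names are illustrative assumptions.
records = [
    {"sku": "A-100", "name": "Desk lamp", "price": "24.99"},
    {"sku": "B-200", "name": "Bookends & stands", "price": "12.50"},
]


def to_xml(items):
    """Serialize a list of dicts to an XML string using user-defined tags."""
    root = ET.Element("inventory")
    for item in items:
        # The SKU is stored as an attribute, the other fields as child elements.
        node = ET.SubElement(root, "item", sku=item["sku"])
        ET.SubElement(node, "name").text = item["name"]
        ET.SubElement(node, "price").text = item["price"]
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Parse the XML string back into a list of dicts."""
    root = ET.fromstring(text)
    return [
        {
            "sku": node.get("sku"),
            "name": node.findtext("name"),
            "price": node.findtext("price"),
        }
        for node in root.findall("item")
    ]


payload = to_xml(records)    # what would be stored or sent
restored = from_xml(payload)  # what the receiving side reconstructs

print(payload)
print(restored == records)
```

Note that the ampersand in the second record is escaped as &amp; in the serialized text and restored on parsing, which is part of what makes the format safe to pass between systems.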
Example of XML:
Here is a simple example of an XML document that stores information about books in a library. Each book entry records the title, author, ISBN, and publication year:
<?xml version="1.0" encoding="UTF-8"?>
<library>
  <book>
    <title>The Great Gatsby</title>
    <author>F. Scott Fitzgerald</author>
    <isbn>9780743273565</isbn>
    <year>1925</year>
  </book>
  <book>
    <title>To Kill a Mockingbird</title>
    <author>Harper Lee</author>
    <isbn>9780060935467</isbn>
    <year>1960</year>
  </book>
  <book>
    <title>1984</title>
    <author>George Orwell</author>
    <isbn>9780451524935</isbn>
    <year>1949</year>
  </book>
</library>
Explanation:
- <?xml version="1.0" encoding="UTF-8"?> – This line is the XML declaration and specifies the XML version and the character encoding used in the document.
- <library> – This is the root element of the XML document. All other elements are nested inside this element.
- <book> – Represents each book in the library. This is a child element of <library>.
- <title>, <author>, <isbn>, <year> – These tags represent elements that store information about the book’s title, author, ISBN number, and publication year.
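To show how a program might consume the library document, here is a minimal Python sketch using the standard library's xml.etree.ElementTree module. The variable names (LIBRARY_XML, books, titles, oldest) are chosen for this illustration only. It walks from the root element down to each book and collects the child elements into dictionaries.

```python
import xml.etree.ElementTree as ET

LIBRARY_XML = """<?xml version="1.0" encoding="UTF-8"?>
<library>
  <book>
    <title>The Great Gatsby</title>
    <author>F. Scott Fitzgerald</author>
    <isbn>9780743273565</isbn>
    <year>1925</year>
  </book>
  <book>
    <title>To Kill a Mockingbird</title>
    <author>Harper Lee</author>
    <isbn>9780060935467</isbn>
    <year>1960</year>
  </book>
  <book>
    <title>1984</title>
    <author>George Orwell</author>
    <isbn>9780451524935</isbn>
    <year>1949</year>
  </book>
</library>"""

# Parse from bytes so the parser honors the encoding named in the declaration.
root = ET.fromstring(LIBRARY_XML.encode("utf-8"))

# Each <book> becomes a dict keyed by the tag names of its children.
books = [
    {child.tag: child.text for child in book}
    for book in root.findall("book")
]

titles = [book["title"] for book in books]
# Element text is always a string, so the year is converted before comparing.
oldest = min(books, key=lambda book: int(book["year"]))

print(root.tag)
print(titles)
print(oldest["title"])
```

Because XML carries all values as text, the program has to decide which fields are numbers; here only the year is converted, while the ISBN stays a string.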
Hypertext Markup Language (HTML)
HTML, which stands for Hypertext Markup Language, is the standard markup language used to create and structure content on the web. It enables the creation of web pages and web applications through the use of tags and attributes. HTML tags, such as <html>, <body>, <div>, <span>, and <p>, describe the webpage’s structure, while attributes within these tags provide additional details like styles and identifiers. HTML is also responsible for organizing content such as text, images, and links into a coherent format that browsers can interpret and display. As the backbone of web development, HTML is used alongside CSS (Cascading Style Sheets) and JavaScript to design interactive and stylistically versatile web pages.
Functions of HTML:
- Structure Content:
HTML provides the basic structure of web pages, which other technologies such as CSS and JavaScript enhance and modify. It organizes text, images, videos, and other content into a logical format.
- Create Hyperlinks:
HTML enables the creation of hyperlinks using the <a> tag, allowing users to navigate between different web pages and websites, which is fundamental to the web’s interconnected nature.
- Embed Media:
HTML supports embedding images, audio, and video into web pages using tags such as <img>, <audio>, and <video>, enabling multimedia integration without needing external plugins.
- Form Handling:
HTML forms, created using the <form> tag and related elements like <input>, <textarea>, and <button>, allow users to submit data to a server, facilitating user interactions and data entry on websites.
- Semantic Meaning:
HTML5 introduced semantic elements like <article>, <section>, <header>, <footer>, and <nav>, which describe the role or meaning of the parts of web pages, helping search engines and assistive technologies to understand the content better.
- Accessibility Features:
HTML supports web accessibility through features such as the alt attribute on images and semantic layout elements that assistive technologies can interpret, making web content accessible to people with disabilities.
- Scripting Integration:
HTML provides the structure that client-side scripts (such as JavaScript) operate on. Scripts can be embedded within HTML documents or linked from them to enhance functionality and interactivity.
- Document Metadata:
HTML allows the definition of metadata for a web page using elements such as <title>, <meta>, <link>, and <style>, which provide information about the document, link to external resources such as stylesheets, and embed CSS.
Example of HTML:
This HTML code creates a basic web page with a heading, a paragraph, an image, and a link.
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Sample Web Page</title>
</head>
<body>
  <h1>Welcome to My Website</h1>
  <p>This is a sample paragraph to show how HTML structures content on a web page. Enjoy browsing!</p>
  <img src="example.jpg" alt="Example Image">
  <p>Click <a href="https://www.example.com">here</a> to visit our main page.</p>
</body>
</html>
Breakdown of the HTML Elements:
- <!DOCTYPE html>: Declares the document as HTML5 and tells browsers to render it in standards mode.
- <html lang="en">: The root element of an HTML page, with a language attribute set to English.
- <head>: Contains meta-information about the HTML document.
- <meta charset="UTF-8">: Specifies the character encoding for the document (UTF-8 here).
- <title>: Sets the title of the HTML document, which appears in the browser title bar or tab.
- <body>: Contains the content of the web page that is visible to users.
- <h1>: Represents a level-one heading on the web page.
- <p>: Defines a paragraph.
- <img src="example.jpg" alt="Example Image">: Embeds an image with a source (src) and an alternative text (alt) that is shown if the image cannot be displayed.
- <a href="https://www.example.com">: Defines a hyperlink that points to “https://www.example.com” with clickable text “here”.
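The elements listed above can also be read by a program. The following is a minimal Python sketch using the standard library's html.parser module; the PageSummary class and its attribute names are hypothetical, introduced only for this illustration. It collects the page title, the link targets, and the image sources with their alternative text from the sample page.

```python
from html.parser import HTMLParser

SAMPLE_HTML = """<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Sample Web Page</title>
</head>
<body>
  <h1>Welcome to My Website</h1>
  <p>This is a sample paragraph to show how HTML structures content on a web page. Enjoy browsing!</p>
  <img src="example.jpg" alt="Example Image">
  <p>Click <a href="https://www.example.com">here</a> to visit our main page.</p>
</body>
</html>"""


class PageSummary(HTMLParser):
    """Hypothetical helper that records the title, links, and images of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self.images = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)  # attrs arrives as a list of (name, value) pairs
        if tag == "a" and "href" in attributes:
            self.links.append(attributes["href"])
        elif tag == "img":
            self.images.append((attributes.get("src"), attributes.get("alt")))
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so it is accumulated.
        if self._in_title:
            self.title += data


summary = PageSummary()
summary.feed(SAMPLE_HTML)
summary.close()

print(summary.title)
print(summary.links)
print(summary.images)
```

Note that <meta> and <img> have no closing tags in the sample, yet the parser handles them without complaint. This event-based parser is a simple tokenizer and does not build the full document tree that a browser would.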
Key differences between XML and HTML
Aspect | XML | HTML |
Primary Purpose | Data storage and transport | Web page structure and display |
Tags Defined | User-defined | Pre-defined |
Syntax Strictness | Very strict | Less strict |
Display in Browsers | Not rendered as a page unless styled or transformed | Rendered as a page |
Doctype Declaration | Optional | Required for standards mode |
Hierarchical Structure | Every element explicitly closed and nested | Some end tags optional; parser infers structure |
Error Handling | Parser stops at a well-formedness error | Parser recovers from errors |
File Extension | .xml | .html or .htm |
Styling | XSLT or CSS | CSS |
Interactivity | None directly | Supported via scripts |
Parsing Mode | Well-formedness required | Tag soup tolerated |
Namespace Support | Supports namespaces | No general namespace syntax |
Data Types Representation | Any data as text | Mostly document content |
Integration with Technologies | Common in data exchange | Integral to web tech |
Usage Context | Data serialization | Document markup |
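The difference in error handling from the table can be illustrated with a short Python sketch that feeds the same malformed markup to an XML parser and to an HTML tokenizer, both from the standard library. The MALFORMED and WELL_FORMED strings and the TagCollector class are assumptions made up for this example, and html.parser only approximates the recovery behavior of a real browser.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Deliberately malformed as XML: <b> is never closed and <br> has no end tag.
MALFORMED = "<p>Unclosed <b>bold text<br>next line</p>"
# The same content with every element closed.
WELL_FORMED = "<p>Closed <b>bold text</b><br/>next line</p>"


def parses_as_xml(text):
    """Return True if the text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The XML parser refuses the whole document at the first violation.
        return False


class TagCollector(HTMLParser):
    """Hypothetical helper that records every start tag it encounters."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


collector = TagCollector()
collector.feed(MALFORMED)  # no exception is raised for the unclosed tags
collector.close()

print(parses_as_xml(MALFORMED))
print(parses_as_xml(WELL_FORMED))
print(collector.start_tags)
```

The XML parser rejects the malformed input outright, while the HTML tokenizer keeps going and still reports the p, b, and br tags. That is the practical meaning of "well-formedness required" versus "tag soup tolerated".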
Key Similarities between XML and HTML
- Markup Languages:
Both XML and HTML are markup languages used to encode documents in a format that is both human-readable and machine-readable.
- Tag-Based:
They both use tags (element names enclosed within angle brackets) to annotate the text or data.
- Hierarchy of Elements:
Both languages organize data in a nested or hierarchical structure, where elements can contain other elements.
- Based on SGML:
Both trace back to the Standard Generalized Markup Language (SGML): XML was defined as a simplified subset of SGML, and HTML through version 4.01 was specified as an SGML application, so they share a common syntax foundation. HTML5 defines its own parsing rules and is no longer formally SGML-based.
- Use of Attributes:
In both XML and HTML, elements can have attributes specified in the start tag to provide additional information about the element.
- Encoding:
They both support different character encoding schemes, allowing for the use of various character sets and special characters.
- Text Files:
Both are typically stored as plain text files, which can be accessed and edited with standard text editors.
- Web Usage:
Both are utilized extensively on the web: HTML for structuring and presenting content on web pages, and XML for the storage and transport of arbitrary data.
- Separation from Style:
In modern usage, both XML and HTML separate content from styling, though the mechanisms differ (typically CSS for HTML; XSL or CSS for XML).