How to Read an HTML Page in Python
HTML (HyperText Markup Language) is the standard language for creating web pages. It describes how a page is constructed. HTML was originally created to define the structure of documents, such as headings, paragraphs, lists, and other elements, to make it easier for researchers to share scientific documents.
Web browsers download HTML documents from a web server or load them from local storage, then render them as multimedia web pages. The first version of HTML included both semantic descriptions of a page's structure and visual cues for the document's appearance.
Reading the HTML Page
1) requests.get(url)
Import the Python library requests, which hides the details of making HTTP requests behind a simple interface. It is a third-party package, so install it first (for example with pip install requests). To fetch a website, call requests.get(...) and pass the URL "https://google.com" so the function knows where to go. The return value is a Response object: its text attribute holds the body of the page, and it also carries useful metadata such as the status code and the content type.
Example
import requests
# Fetch the page and print the HTML body as text
response = requests.get('https://google.com')
print(response.text)
2) urllib.request
The urllib.request module is part of the Python standard library, so nothing needs to be installed. Its urlopen() function fetches a web resource and returns a response object whose read() method gives the body as bytes. This makes for a short Python 3 way to access the Google website as before:
Example
import urllib.request as r
# urlopen() returns a response object; read() returns the body as bytes
page = r.urlopen('https://google.com')
print(page.read())
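Because read() returns bytes, the printed output above shows a b'...' literal rather than readable text. The following is a minimal sketch of decoding the body to a string. The helper name fetch_html, the timeout value, and the UTF-8 fallback are assumptions made for illustration, and a temporary local file stands in for a real website so the example runs without a network connection.

```python
import tempfile
import urllib.request
from pathlib import Path


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Fetch a URL and return its body decoded to text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()  # bytes, not str
        # Use the charset from the Content-Type header when one is given;
        # otherwise fall back to UTF-8 (an assumption, not a guarantee).
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")


# Offline demonstration: urlopen() also accepts file:// URLs.
sample = Path(tempfile.mkdtemp()) / "sample.html"
sample.write_text("<html><body><p>Hello</p></body></html>", encoding="utf-8")

page_text = fetch_html(sample.as_uri())
print(page_text)
```

The same function can be called with an http:// or https:// URL such as 'https://google.com' when a network connection is available.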
Example: "Hello World" HTML Page in Python
One of the more powerful ideas in computer science is that a file that looks like code from one angle can be treated as data from another. This makes it possible to write programs that generate other programs. Next, we'll use Python to produce a "Hello World!" HTML file. To do this, we'll store the HTML tags in a multi-line Python string and write that string to a new file. The file is saved with the .html extension rather than the .txt extension.
Normally, a doctype declaration is the first line of an HTML file. This was demonstrated in a previous lesson when you created an HTML "Hello World" program. We leave out the doctype in this example to make the code simpler to read. Remember that enclosing text in three quotation marks creates a multi-line string.
The example below writes the HTML page from Python:
# write-html.py
message = """<html>
<head></head>
<body><p>Hello World!</p></body>
</html>"""

# The with block closes the file automatically once the write is done
with open('helloworld.html', 'w') as f:
    f.write(message)
Go to your Firefox browser, select File -> New Tab, then select File -> Open File in that tab. Choose helloworld.html. Your message should now be visible in the browser. Consider for a moment that you can now write a program that creates a webpage automatically. If you wanted to, there is no reason you couldn't write a program that builds an entire website.
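As a rough illustration of that idea, the sketch below generates several pages from a small dictionary. The names build_page and site, the page contents, and the temporary output directory are all hypothetical choices made for this example. Unlike the simpler example, it includes the doctype line and uses html.escape() so that characters such as < in the text are not read as markup.

```python
import html
import tempfile
from pathlib import Path


def build_page(title: str, paragraphs: list[str]) -> str:
    """Return a small HTML document; the text is escaped before insertion."""
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"<head><title>{html.escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body>\n"
        "</html>\n"
    )


# Hypothetical site content: file name -> (title, paragraphs)
site = {
    "index.html": ("Home", ["Hello World!"]),
    "about.html": ("About", ["Generated by Python.", "1 < 2 is escaped."]),
}

# Write every page into a fresh temporary directory
out_dir = Path(tempfile.mkdtemp())
for name, (title, paragraphs) in site.items():
    (out_dir / name).write_text(build_page(title, paragraphs), encoding="utf-8")

print(sorted(p.name for p in out_dir.iterdir()))
```

Each generated file can be opened in the browser the same way as helloworld.html.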
Using os and webbrowser
The Python webbrowser module provides a high-level interface for displaying web-based documents to users, and the os module provides a portable way of using operating-system-dependent functionality. In most cases, simply calling the webbrowser module's open() function opens a URL in the default browser. To use it, import the module and call open().
According to the overview, dynamic web applications usually involve gathering data from a web page form, processing it in a server-side program, and displaying the results on a web page.
Because introducing all of these concepts at once could be hard to follow, this section stays with an ordinary Python program that produces the final web page as its output. Let's look at how using these two modules together can help us open an HTML page in the browser:
Function used: open_new_tab() opens an HTML file in a new tab of your default browser.
Syntax
open_new_tab(filename)
Example
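import webbrowser
import os

html_template = """
<html>
<head></head>
<body>
<p>HTML PAGE IN PYTHON</p>
</body>
</html>
"""

# Write the HTML to a file in the current working directory
with open('GFG.html', 'w') as f:
    f.write(html_template)

# Build a file URL for the same file that was just written, then open it
filename = 'file:///' + os.getcwd() + '/' + 'GFG.html'
webbrowser.open_new_tab(filename)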
Output
HTML PAGE IN PYTHON
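Building the file URL by joining 'file:///' with os.getcwd() can break when the path contains spaces or other special characters. One possible alternative, shown here as a sketch and not as the only approach, is to let pathlib build the URL with as_uri(). The helper name show_html, its launch parameter, and the file name "my page.html" are assumptions made for illustration.

```python
import tempfile
import webbrowser
from pathlib import Path


def show_html(path: Path, launch: bool = True) -> str:
    """Return the file:// URL for path and, if launch is True, open it in a new tab."""
    url = path.resolve().as_uri()  # absolute path, percent-encoded
    if launch:
        webbrowser.open_new_tab(url)
    return url


# The file name contains a space on purpose to show the encoding
page = Path(tempfile.mkdtemp()) / "my page.html"
page.write_text("<html><body><p>HTML PAGE IN PYTHON</p></body></html>", encoding="utf-8")

# launch=False keeps this demo from opening a browser; pass True to view the page
url = show_html(page, launch=False)
print(url)
```

The space in the file name appears as %20 in the printed URL, which is the form a browser expects.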