To get all links and titles where the date contains the text "March":
1. Find the dates: locate all <td> elements that contain the text "March".
2. For each of those cells, find the preceding <a> tag with the .find_previous() method. That tag contains the desired title and link.
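import requests
from bs4 import BeautifulSoup

url = "https://www.pds.com.ph/index.html%3Fpage_id=3261.html"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

fmt_string = "{:<20} {:<130} {}"
print(fmt_string.format("Date", "Title", "Link"))
print("-" * 200)

# ":-soup-contains" replaces the deprecated ":contains" (needs soupsieve >= 2.1).
# The text match is case-sensitive, so search for "March", not "march".
for tag in soup.select("td:-soup-contains('March')"):
    # The nearest preceding <a> holds the title and the relative link.
    a_tag = tag.find_previous("a")
    print(
        fmt_string.format(
            tag.text, a_tag.text, "https://www.pds.com.ph/" + a_tag["href"],
        )
    )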
Output (truncated):
Date Title Link
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
March 31, 2021 RCBC Lists PHP 17.87257 Billion ASEAN Sustainability Bonds on PDEx https://www.pds.com.ph/index.html%3Fp=87239.html
March 16, 2021 Aboitiz Power Corporation Raises 8 Billion Fixed Rate Bonds on PDEx https://www.pds.com.ph/index.html%3Fp=86743.html
March 1, 2021 Century Properties Group, Inc Returns to PDEx with PHP 3 Billion Fixed Rate Bonds https://www.pds.com.ph/index.html%3Fp=86366.html
March 27, 2020 BPI Lists Over PhP 33 Billion of Fixed Rate Bonds on PDEx https://www.pds.com.ph/index.html%3Fp=74188.html
March 25, 2020 SM Prime Raises PHP 15 Billion Fixed Rate Bonds on PDEx https://www.pds.com.ph/index.html%3Fp=74082.html
...
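
To try the same two steps without a network request, here is a minimal sketch that runs against a small made-up HTML fragment. The markup in SAMPLE_HTML and the helper name links_for_month are assumptions for illustration only, and the real page's structure may differ. It also swaps the CSS selector for a plain Python substring test so the month match can be case-insensitive.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; the real page may be laid out differently.
SAMPLE_HTML = """
<table>
  <tr><td><a href="post-1.html">First title</a></td></tr>
  <tr><td>March 31, 2021</td></tr>
  <tr><td><a href="post-2.html">Second title</a></td></tr>
  <tr><td>April 5, 2021</td></tr>
  <tr><td><a href="post-3.html">Third title</a></td></tr>
  <tr><td>March 1, 2020</td></tr>
</table>
"""


def links_for_month(html, month):
    """Return (date, title, href) tuples for every <td> whose text mentions `month`."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for cell in soup.find_all("td"):
        # Step 1: keep only cells whose text contains the month (case-insensitive).
        if month.lower() not in cell.get_text().lower():
            continue
        # Step 2: the closest <a> that appears before this cell in the document.
        a_tag = cell.find_previous("a")
        if a_tag is None:
            continue  # no link precedes this cell
        results.append(
            (cell.get_text(strip=True), a_tag.get_text(strip=True), a_tag["href"])
        )
    return results


for date, title, href in links_for_month(SAMPLE_HTML, "march"):
    print(date, title, href)
```

The None check matters because .find_previous() returns None when no matching tag precedes the cell, and indexing a_tag["href"] would then raise a TypeError.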