This is the way it should be:
function file_get_contents_curl($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

$html = file_get_contents_curl("http://example.com/");

//parsing begins here:
$doc = new DOMDocument();
@$doc->loadHTML($html);
$nodes = $doc->getElementsByTagName('title');

//get and display what you need:
// item(0) is null when the page has no title element, so check the length first
$title = $nodes->length > 0 ? $nodes->item(0)->nodeValue : '';

// default to empty strings so the echo lines never read undefined variables
$description = '';
$keywords = '';
$metas = $doc->getElementsByTagName('meta');
for ($i = 0; $i < $metas->length; $i++)
{
    $meta = $metas->item($i);
    if ($meta->getAttribute('name') == 'description')
        $description = $meta->getAttribute('content');
    if ($meta->getAttribute('name') == 'keywords')
        $keywords = $meta->getAttribute('content');
}

echo "Title: $title". '<br/><br/>';
echo "Description: $description". '<br/><br/>';
echo "Keywords: $keywords";
<?php
// Assuming the above tags are at www.example.com
$tags = get_meta_tags('http://www.example.com/');
// Notice how the keys are all lowercase now, and
// how . was replaced by _ in the key.
echo $tags['author']; // name
echo $tags['keywords']; // php documentation
echo $tags['description']; // a php manual
echo $tags['geo_position']; // 49.33;-86.59
?>
get_meta_tags() will help you with everything but the title. To get the title, just use a regex.
$url = 'http://some.url.com';
preg_match("/<title>(.+)<\/title>/siU", file_get_contents($url), $matches);
$title = $matches[1];
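The regex above assumes the fetch succeeds and the page actually has a title; otherwise $matches[1] is undefined. A minimal sketch of a guarded variant follows, run against an inline HTML string so it needs no network access. The helper name extract_title and the sample markup are illustrative assumptions, not part of the original answer.

```php
<?php
// Hypothetical helper: returns the title text, or null when no title is found.
function extract_title(string $html): ?string
{
    // s: dot matches newlines, i: case-insensitive, U: ungreedy.
    // [^>]* tolerates attributes on the opening title tag.
    if (preg_match('/<title[^>]*>(.*)<\/title>/siU', $html, $matches)) {
        // Decode entities such as &amp; and trim surrounding whitespace.
        return trim(html_entity_decode($matches[1], ENT_QUOTES | ENT_HTML5, 'UTF-8'));
    }
    return null;
}

$html = "<html><head><title>\n  Example &amp; Co\n</title></head><body></body></html>";
var_dump(extract_title($html));              // string(12) "Example & Co"
var_dump(extract_title('<p>no title</p>'));  // NULL
```

Returning null keeps "no title" distinguishable from an empty title, so the caller can decide on a fallback.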
Only meta tags with name attributes are parsed, for example:
<meta name="description" content="the description">
Unfortunately, the built-in PHP function get_meta_tags() requires the name attribute, and certain sites, such as Twitter, leave that off in favor of the property attribute. This function, using a mix of regex and DOMDocument, returns a keyed array of meta tags from a webpage. It checks for the name attribute first, then the property attribute. This has been tested on Instagram, Pinterest and Twitter.
/**
* Extract metatags from a webpage
*/
function extract_tags_from_url($url) {
    $tags = array();

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    $contents = curl_exec($ch);
    curl_close($ch);

    if (empty($contents)) {
        return $tags;
    }

    if (preg_match_all('/<meta([^>]+)content="([^>]+)>/', $contents, $matches)) {
        $doc = new DOMDocument();
        $doc->loadHTML('<?xml encoding="utf-8" ?>' . implode($matches[0]));
        $tags = array();
        foreach($doc->getElementsByTagName('meta') as $metaTag) {
            if($metaTag->getAttribute('name') != "") {
                $tags[$metaTag->getAttribute('name')] = $metaTag->getAttribute('content');
            }
            elseif ($metaTag->getAttribute('property') != "") {
                $tags[$metaTag->getAttribute('property')] = $metaTag->getAttribute('content');
            }
        }
    }

    return $tags;
}
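To see the name-then-property lookup in isolation, here is a minimal sketch that applies the same idea to an inline HTML string instead of a fetched page. The helper name extract_tags_from_html and the sample markup are assumptions made for illustration; the regex pre-filter from the function above is left out for brevity.

```php
<?php
// Hypothetical helper: name-then-property lookup on an HTML string.
function extract_tags_from_html(string $html): array
{
    $tags = [];
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Prefer the name attribute; fall back to property (used by Open Graph tags).
        $key = $meta->getAttribute('name');
        if ($key === '') {
            $key = $meta->getAttribute('property');
        }
        if ($key !== '') {
            $tags[$key] = $meta->getAttribute('content');
        }
    }
    return $tags;
}

$html = '<html><head>'
      . '<meta name="description" content="A sample page">'
      . '<meta property="og:title" content="Sample">'
      . '</head><body></body></html>';
print_r(extract_tags_from_html($html));
```

Using libxml_use_internal_errors() rather than the @ operator is a design choice here: it silences parser warnings for sloppy markup without hiding unrelated errors.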
The chosen answer is good, but it doesn't work when a site is redirected (very common!) and doesn't return OG tags, which are the new industry standard. Here's a little function that is a bit more usable in 2018. It tries to get OG tags and falls back to meta tags if it can't find them:
function getSiteOG($url, $specificTags = 0) {
    $doc = new DOMDocument();
    @$doc->loadHTML(file_get_contents($url));
    $res['title'] = $doc->getElementsByTagName('title')->item(0)->nodeValue;
    foreach ($doc->getElementsByTagName('meta') as $m) {
        $tag = $m->getAttribute('name') ?: $m->getAttribute('property');
        if (in_array($tag, ['description', 'keywords']) || strpos($tag, 'og:') === 0) $res[str_replace('og:', '', $tag)] = $m->getAttribute('content');
    }
    return $specificTags ? array_intersect_key($res, array_flip($specificTags)) : $res;
}
How to use it:
/////////////
//SAMPLE USAGE:
print_r(getSiteOG("http://www.stackoverflow.com")); //note the incorrect url
/////////////
//OUTPUT:
Array
(
    [title] => Stack Overflow - Where Developers Learn, Share, & Build Careers
    [description] => Stack Overflow is the largest, most trusted online community for developers to learn, share their programming knowledge, and build their careers.
    [type] => website
    [url] => https://stackoverflow.com/
    [site_name] => Stack Overflow
    [image] => https://cdn.sstatic.net/Sites/stackoverflow/img/apple-touch-icon@2.png?v=73d79a89bded
)
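The optional second argument of getSiteOG() narrows the result to the keys you ask for. A small sketch of just that filtering step, using a made-up result array in place of a real fetch:

```php
<?php
// Made-up data standing in for the array that getSiteOG() builds.
$res = [
    'title'       => 'Example',
    'description' => 'Demo page',
    'image'       => 'https://example.com/a.png',
];
$specificTags = ['title', 'image'];

// array_flip() turns the wanted names into keys;
// array_intersect_key() then keeps only the entries of $res with those keys.
$filtered = array_intersect_key($res, array_flip($specificTags));
print_r($filtered);
```

Requested keys that are missing from the result are simply absent from the filtered array, so callers should still check with isset() before reading them.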
In this article, I am going to show you how to get webpage titles and meta tags from external website URLs using PHP. The full source code is available further down the page.

Most external websites specify three common pieces of metadata for a web page: the page title, the page description, and the page keywords. All of this meta tag information can be fetched using PHP. The page description and page keywords are within <meta> tags, and the page title is between the <title> and </title> tags.

Sample output:
Title: Home - codeat21.com - Web Designing and Development Tutorials Website
<?php
// Web page URL
$url = 'https://codeat21.com/';
// Extract HTML using curl
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$data = curl_exec($ch);
curl_close($ch);
// Load HTML to DOM object
$dom = new DOMDocument();
@$dom->loadHTML($data);
// Parse DOM to get Title data
$nodes = $dom->getElementsByTagName('title');
$title = $nodes->item(0)->nodeValue;
// Parse DOM to get meta data
$metas = $dom->getElementsByTagName('meta');
$description = '';
$keywords = '';
$site_name = '';
$image = '';
for($i=0; $i<$metas->length; $i++){
$meta = $metas->item($i);
if($meta->getAttribute('name') == 'description'){
$description = $meta->getAttribute('content');
}
if($meta->getAttribute('name') == 'keywords'){
$keywords = $meta->getAttribute('content');
}
if($meta->getAttribute('property') == 'og:site_name'){
$site_name = $meta->getAttribute('content');
}
if($meta->getAttribute('property') == 'og:image'){
$image = $meta->getAttribute('content');
}
}
echo "Title: $title". '<br/><br/>';
echo "Description: $description". '<br/><br/>';
echo "Keywords: $keywords". '<br/><br/>';
echo "site_name: $site_name". '<br/><br/>';
echo "image: $image";
?>
In this article, I am going to show you how to extract all meta tags of a web page in PHP. After we get the meta tags of a web page, we are going to get specific meta content by its name.

Extracting all meta tags is an easy task in PHP, as there is a built-in function available. The get_meta_tags() function can extract all meta tags that have a name attribute from a given URL. All you need to do is pass the URL as a parameter.

Quite simple, isn’t it? In the example further down, the first two lines of code are enough to extract all the meta tags of a web page; the last line prints the result on the web page. If we run the code, it prints an array that contains all the meta tags. Each meta tag name is the array index that holds that tag's value, so we pick a value by its index, that is, by the meta name.
A meta tag on a web page looks like this:
<meta name="distribution" content="This is the content of our meta tag" />
In our example, we are going to use “https://www.domain.com/” to retrieve the meta tags. Let’s see our code below:
<?php
$url = "https://www.domain.com/";
$metas = get_meta_tags($url);
echo "<pre>"; print_r($metas); echo "</pre>";
?>
Now we can get any specific meta tag value by its name, as shown below:
<?php
$description = $metas['description'];
$keywords = $metas['keywords'];
$rating = $metas['rating'];
?>
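Reading a key that the page does not define raises an "undefined array key" warning. The sketch below shows one way to guard against that. It relies on get_meta_tags() also accepting a local file path, so it writes a sample page to a temporary file and runs without network access; the sample markup and fallback strings are illustrative assumptions.

```php
<?php
// Write a small sample page to a temporary file so no network access is needed.
$path = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents($path, "<html>\n<head>\n"
    . "<meta name=\"description\" content=\"A sample page\">\n"
    . "<meta name=\"keywords\" content=\"php, meta\">\n"
    . "</head>\n<body></body>\n</html>\n");

$metas = get_meta_tags($path); // array on success, false on failure
unlink($path);

// The null coalescing operator supplies a fallback for absent tags.
$description = $metas['description'] ?? '';
$rating      = $metas['rating'] ?? '(not set)';

echo $description, "\n"; // A sample page
echo $rating, "\n";      // (not set)
```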
You can find your page’s meta description within the <head> section of the page’s HTML markup. Most CMSs will allow you to edit this markup and change your meta description, either directly within the code or via the meta description field within the page’s metadata settings.

As with title tags, each page’s meta description should be directly relevant to the page it describes and unique from the descriptions for other pages. Otherwise, you'll end up with SERP results in which several pages show the same description.

One way to combat duplicate meta descriptions is to implement a dynamic and programmatic way to create unique meta descriptions for automated pages. If you have the resources, though, there's no substitute for an original description written specifically for each page.
HTML code example
<head>
<meta name="description" content="This is an example of a
meta description. This will often show up in search results.">
</head>
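One possible way to generate such descriptions programmatically is to derive them from each page's own content. The following is only a rough sketch: the function name build_meta_description and the 155-character limit are assumptions chosen for illustration, not values taken from the text above.

```php
<?php
// Hypothetical helper: builds a meta description tag from a page's body content.
function build_meta_description(string $content, int $maxLength = 155): string
{
    // Drop markup and collapse runs of whitespace into single spaces.
    $text = trim(preg_replace('/\s+/u', ' ', strip_tags($content)) ?? '');

    // Keep at most $maxLength characters, cutting at a word boundary.
    // If the first word alone exceeds the limit, the text is left uncut.
    if (preg_match('/^.{0,' . $maxLength . '}(?=\s|$)/us', $text, $m) && $m[0] !== '') {
        $text = $m[0];
    }

    // Escape quotes and angle brackets so the value is safe inside the attribute.
    return '<meta name="description" content="'
        . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '">';
}

echo build_meta_description('<p>Hello   <b>world</b></p>'), "\n";
// <meta name="description" content="Hello world">
```

Because the text comes from each page's own content, two pages only get the same description when their opening text is identical.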