Problem description
PHP DOMDocument echoing problem
$content = '<!--<sup><span style="font-weight:bold;color:black;">0</span></sup><br/>-->
<div class="popular-video-image">
<a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>">
<img src="/images/topvideo/1.jpg" alt=""/>
</a>
<span class="popular-video-artist ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Far East Movement</a></span>
<span class="popular-video-title ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Like a G6</a></span>
</div>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadHTML($content);
// Prefix every link's href with the site's base URL.
foreach ($dom->getElementsByTagName('a') as $node)
{
    $node->setAttribute('href', 'http://mysite.ru/' . $node->getAttribute('href'));
}
$dom->formatOutput = true;
echo $dom->saveXML($dom->documentElement);
Output:
<html>
<body>
<div class="popular-video-image">
<a href="http://mysite.ru/video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="&lt;lang video_go_to=Far East Movement - Like a G6&gt;">
<img src="/images/topvideo/1.jpg" alt=""/></a>
<span class="popular-video-artist ellipsis"><a href="http://mysite.ru/video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="&lt;lang video_go_to=Far East Movement - Like a G6&gt;" class="ellipsis">Far East Movement</a></span>
<span class="popular-video-title ellipsis"><a href="http://mysite.ru/video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="&lt;lang video_go_to=Far East Movement - Like a G6&gt;" class="ellipsis">Like a G6</a></span>
</div>
</body>
</html>
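I don't want the html and body tags to be added. I also don't want the <lang ...> markup in the title attributes to be replaced with &lt;lang ...&gt;. And the &#13; entities are unnecessary too.
I want to get back the same content I passed in, with only the links modified.
Sorry for my poor English!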
## Reference solutions
#### Method 1:
You are seeing &#13; at the end of each line because your HTML has Windows-style line endings (CR+LF). To get rid of them, run this on the content before you feed it into DOMDocument, so that they are converted to Unix-style line endings (LF):
$content = preg_replace('/\r\n/', "\n", $content);
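As a slightly broader variant of the same idea, the sketch below uses str_replace and also converts any lone CR characters. The sample markup and the variable names $normalized and $xml are assumptions made for illustration.

```php
<?php
// Sample input with Windows-style (CR+LF) line endings; illustrative only.
$content = "<div>\r\n<a href=\"video/1/\">One</a>\r\n</div>";

// Convert CR+LF first, then any remaining lone CR, to LF.
$normalized = str_replace(["\r\n", "\r"], "\n", $content);

$dom = new DOMDocument;
$dom->loadHTML($normalized);

// With no CR characters left in the input, there is nothing to serialize as &#13;.
$xml = $dom->saveXML($dom->documentElement);
echo $xml;
```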
#### Method 2:
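saveXML takes an optional parameter that lets you specify the node to output. Here documentElement is the html element, its first child is body, and the first child of body is the div:
$dom->saveXML($dom->documentElement->firstChild->firstChild);
This leaves the html and body tags out of the output.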
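The line above outputs only the first child of body. If the fragment has several top-level nodes, one possible approach is to serialize every child of body and join the results. This is a minimal sketch under that assumption; the sample markup and the $output variable are illustrative and not part of the original answer.

```php
<?php
// Illustrative fragment with two top-level elements.
$content = '<div class="a"><a href="video/1/">One</a></div>'
         . '<div class="b"><a href="video/2/">Two</a></div>';

$dom = new DOMDocument;
$dom->loadHTML($content);

// Prefix every link's href with the site's base URL.
foreach ($dom->getElementsByTagName('a') as $node) {
    $node->setAttribute('href', 'http://mysite.ru/' . $node->getAttribute('href'));
}

// Serialize each child of <body> separately, so the html/body wrappers are never emitted.
$body = $dom->getElementsByTagName('body')->item(0);
$output = '';
foreach ($body->childNodes as $child) {
    $output .= $dom->saveXML($child);
}
echo $output;
```

Looking up body by tag name avoids relying on it being the first child of the html element.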
#### Method 3:
I guess that the <html> and <body> tags get placed in because you are using loadHTML. Try using loadXML instead.
As for <lang>, it has to be replaced with &lt;lang&gt; because otherwise the resulting XML would not be valid. If it is causing you problems, you should change your approach a little and work with it, not against it.
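Note that loadXML would likely fail on this particular input, because a raw < inside an attribute value is not well-formed XML. An alternative that stays with loadHTML, assuming PHP 5.4 or later built against libxml 2.7.8 or later, is to pass the LIBXML_HTML_NOIMPLIED and LIBXML_HTML_NODEFDTD options so the parser does not add the implied html/body elements or a doctype. The wrapper div, the sample markup, and the $output variable below are assumptions for illustration; the wrapper simply gives the parser a single root element.

```php
<?php
// Illustrative fragment; not the markup from the question.
$content = '<div class="v"><a href="video/1/">One</a></div>';

$dom = new DOMDocument;
// Hypothetical wrapper element so the fragment has exactly one root.
$dom->loadHTML(
    '<div id="wrapper">' . $content . '</div>',
    LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
);

// Prefix every link's href with the site's base URL.
foreach ($dom->getElementsByTagName('a') as $node) {
    $node->setAttribute('href', 'http://mysite.ru/' . $node->getAttribute('href'));
}

// Emit only the wrapper's children, dropping the wrapper itself.
$output = '';
foreach ($dom->documentElement->childNodes as $child) {
    $output .= $dom->saveHTML($child);
}
echo $output;
```

How < and > inside attribute values are serialized can differ between libxml versions, so the escaping of the <lang ...> titles may still need separate handling.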
#### Method 4:
<?php
$content = '<!--<sup><span style="font-weight:bold;color:black;">0</span></sup><br/>-->
<div class="popular-video-image">
<a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>">
<img src="/images/topvideo/1.jpg" alt=""/>
</a>
<span class="popular-video-artist ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Far East Movement</a></span>
<span class="popular-video-title ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Like a G6</a></span>
</div>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadHTML($content);
// Prefix every link's href with the site's base URL.
foreach ($dom->getElementsByTagName('a') as $node)
{
    $node->setAttribute('href', 'http://mysite.ru/' . $node->getAttribute('href'));
}
$dom->formatOutput = true;
// Strip the doctype and the html/body wrappers, remove doubled newlines,
// and turn &lt; and &gt; back into < and >.
echo preg_replace('#^<!DOCTYPE.+?>#', '', str_replace(
    array('<html>', '</html>', '<body>', '</body>', "\n\n", '&lt;', '&gt;'),
    array('', '', '', '', '', '<', '>'),
    $dom->saveHTML()
));
(by Isis, mattalxndr, Stephen Curran, Jon, Isis)