Need a way to always receive a tag position in bytes

HTML code
```html
<!DOCTYPE html>
<html>
    <head>
        <meta http-equiv='content-type' content='text/html; charset=windows-1251'>
        <title>кириллица кириллица кириллица кириллица кириллица кириллица</title>
    </head>
    <body>
        <img src='/biz-globus-sea-32X32.webp'>
    </body>
</html>
```
PHP code
```php
$body = file_get_contents('./index.html');
var_dump(strlen($body));
$html = \duzun\hQuery::fromHTML($body);
$imageNodes = $html->find('img') ?? [];
foreach($imageNodes as $pos => $imageNode) {
    var_dump($pos);
}
```
Output 
```
int(298)
int(329) <- the position is beyond the end of the 298-byte page
```
This happens because of `<meta http-equiv='content-type' content='text/html; charset=windows-1251'>`. The library tries to count the position in characters.
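One way such an offset can arise, offered as an assumption rather than a confirmed description of hQuery's internals: if the document is re-encoded from the declared charset to UTF-8 before positions are computed, each Cyrillic letter grows from one byte to two and every later offset shifts. A minimal sketch using `mb_convert_encoding` (requires the mbstring extension):

```php
<?php
// Three Cyrillic letters ("кир") encoded as windows-1251, followed by a tag.
$raw = "\xEA\xE8\xF0<img>";

// Re-encode to UTF-8, as a parser might do after reading the meta charset.
// Assumption: this is NOT confirmed to be what hQuery does internally.
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'Windows-1251');

$rawPos  = strpos($raw, '<img');   // byte offset in the original bytes
$utf8Pos = strpos($utf8, '<img');  // byte offset in the re-encoded copy

var_dump(strlen($raw), $rawPos, $utf8Pos);
```

In this sketch the tag sits at byte 3 of the original string but at byte 6 of the re-encoded copy, so an offset taken from the copy does not index correctly into the original bytes.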

Using a multibyte character position is a bad idea because of emoji. There needs to be a way to disable the use of encoding data.
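To illustrate why byte and character positions diverge, here is a small sketch using only standard PHP string functions (`mb_strpos` requires the mbstring extension). The sample string is made up for the example:

```php
<?php
// A UTF-8 string: one emoji (4 bytes, 1 code point) followed by a tag.
$html = "\u{1F600}<img>";

$bytePos = strpos($html, '<img');                 // counts bytes
$charPos = mb_strpos($html, '<img', 0, 'UTF-8');  // counts code points

var_dump($bytePos, $charPos);

// Only the byte offset works directly with byte-oriented functions
// such as substr() or fseek().
var_dump(substr($html, $bytePos, 4));
```

The two offsets differ as soon as any multibyte character precedes the tag, which is why a byte position is the safer value for slicing the raw page.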

At the moment I'm using this workaround to always receive the tag position in bytes:
```php
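// Overwrite charset-declaring <meta> tags with the same number of spaces,
// so no encoding is detected and all byte offsets stay unchanged.
$body = preg_replace_callback([
    "/<meta[^>]*http-equiv=('|\")content-type('|\")[^>]*>/Ui",
    "/<meta[^>]*charset=('|\")[^'\"]+('|\")[^>]*>/Ui",
], function ($matches) {
    return str_repeat(' ', strlen($matches[0]));
}, $body);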
```
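As a possible alternative while no such option exists, byte offsets can be collected directly from the raw bytes with `preg_match_all` and `PREG_OFFSET_CAPTURE`, which reports offsets in bytes. This is only a sketch: `findTagByteOffsets` is a hypothetical helper, not part of hQuery, and a plain regex will also match tags inside comments or scripts.

```php
<?php
/**
 * Hypothetical helper (not part of hQuery): returns the byte offset of every
 * opening tag with the given name, found by scanning the raw bytes.
 */
function findTagByteOffsets(string $body, string $tag): array
{
    // Match "<tag" only when followed by whitespace, ">" or "/".
    $pattern = '/<' . preg_quote($tag, '/') . '(?=[\s>\/])/i';
    preg_match_all($pattern, $body, $matches, PREG_OFFSET_CAPTURE);

    // Each match is [matchedText, byteOffset]; keep only the offsets.
    return array_column($matches[0], 1);
}

// Sample input: windows-1251 bytes in the title, then two image tags.
$body = "<title>\xEA\xE8\xF0</title>\r\n<img src='/a.webp'><IMG src='/b.webp'>";
$offsets = findTagByteOffsets($body, 'img');
var_dump($offsets);
```

Because the pattern runs without the `u` modifier, the input is treated as a byte string regardless of any charset declared in the page.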

Need a way to always receive a tag position in bytes #100
