# Chrome PHP

[![Latest Stable Version](https://poser.pugx.org/chrome-php/chrome/version)](https://packagist.org/packages/chrome-php/chrome)
[![License](https://poser.pugx.org/chrome-php/chrome/license)](https://packagist.org/packages/chrome-php/chrome)

This library lets you start playing with chrome/chromium in headless mode from PHP.

Can be used synchronously and asynchronously!

## Features

- Open chrome or chromium browser from php
- Create pages and navigate to pages
- Take screenshots
- Evaluate javascript on the page
- Make PDF
- Emulate mouse
- Emulate keyboard
- Always IDE friendly

Happy browsing!

## Requirements

Requires PHP 7.3-8.0 and a chrome/chromium 65+ executable.

Note that the library is only tested on Linux but is compatible with macOS and Windows.

## Installation

The library can be installed with Composer and is available on packagist under
[chrome-php/chrome](https://packagist.org/packages/chrome-php/chrome):

```bash
$ composer require chrome-php/chrome
```

## Usage

It uses a simple and understandable API to start chrome, open pages, take screenshots,
crawl websites... and do almost everything that you can do with chrome as a human.

```php
use HeadlessChromium\BrowserFactory;

$browserFactory = new BrowserFactory();

// starts headless chrome
$browser = $browserFactory->createBrowser();

try {
    // creates a new page and navigates to a URL
    $page = $browser->createPage();
    $page->navigate('http://example.com')->waitForNavigation();

    // get page title
    $pageTitle = $page->evaluate('document.title')->getReturnValue();

    // screenshot - Say "Cheese"! 😄
    $page->screenshot()->saveToFile('/foo/bar.png');

    // pdf
    $page->pdf(['printBackground' => false])->saveToFile('/foo/bar.pdf');
} finally {
    // bye
    $browser->close();
}
```

### Using different chrome executable

When starting, the factory will look for the environment variable ``"CHROME_PATH"`` to use as the chrome executable.
If the variable is not found, it will try to guess the correct executable path according to your OS or use ``"chrome"`` as the default.

You are also able to explicitly set up any executable of your choice when creating a new object. For instance ``"chromium-browser"``:

```php
use HeadlessChromium\BrowserFactory;

// replace default 'chrome' with 'chromium-browser'
$browserFactory = new BrowserFactory('chromium-browser');
```

### Debugging

The following example disables headless mode to ease debugging:

```php
use HeadlessChromium\BrowserFactory;

$browserFactory = new BrowserFactory();

$browser = $browserFactory->createBrowser([
    'headless' => false, // disable headless mode
]);
```

Other debug options:

```php
[
    'connectionDelay' => 0.8,            // add 0.8 second of delay between each instruction sent to chrome
    'debugLogger'     => 'php://stdout', // will enable verbose mode
]
```

About ``debugLogger``: this can be a resource string, a resource, or an object implementing
``LoggerInterface`` from Psr\Log (such as [monolog](https://github.com/Seldaek/monolog)
or [apix/log](https://github.com/apix/log)).

## API

### Browser Factory

```php
use HeadlessChromium\BrowserFactory;

$browserFactory = new BrowserFactory();
$browser = $browserFactory->createBrowser([
    'windowSize'   => [1920, 1000],
    'enableImages' => false,
]);
```

#### Options

Here are the options available for the browser factory:

| Option name               | Default | Description                                                                                   |
|---------------------------|---------|-----------------------------------------------------------------------------------------------|
| `connectionDelay`         | `0`     | Delay to apply between each operation for debugging purposes                                  |
| `customFlags`             | none    | An array of flags to pass to the command line. E.g.: `['--option1', '--option2=someValue']`   |
| `debugLogger`             | `null`  | A string (e.g. "php://stdout"), or resource, or PSR-3 logger instance to print debug messages |
| `enableImages`            | `true`  | Toggles loading of images                                                                     |
| `envVariables`            | none    | An array of environment variables to pass to the process (example DISPLAY variable)           |
| `headless`                | `true`  | Enable or disable headless mode                                                               |
| `ignoreCertificateErrors` | `false` | Set chrome to ignore ssl errors                                                               |
| `keepAlive`               | `false` | Set to `true` to keep alive the chrome instance when the script terminates                    |
| `noSandbox`               | `false` | Enable no sandbox mode, useful to run in a docker container                                   |
| `proxyServer`             | none    | Proxy server to use. Usage: `127.0.0.1:8080` (authorisation with credentials does not work)   |
| `sendSyncDefaultTimeout`  | `5000`  | Default timeout (ms) for sending sync messages                                                |
| `startupTimeout`          | `30`    | Maximum time in seconds to wait for chrome to start                                           |
| `userAgent`               | none    | User agent to use for the whole browser (see page API for alternative)                        |
| `userDataDir`             | none    | Chrome user data dir (default: a new empty dir is generated temporarily)                      |
| `windowSize`              | none    | Size of the window. Usage: `$width, $height` - see also Page::setViewport                     |

### Browser API

#### Create a new page (tab)

```php
$page = $browser->createPage();
```

#### Close the browser

```php
$browser->close();
```

#### Set a script to evaluate before every page created by this browser will navigate

```php
$browser->setPagePreScript('// Simulate navigator permissions;
const originalQuery = window.navigator.permissions.query;
window.navigator.permissions.query = (parameters) => (
    parameters.name === "notifications" ?
        Promise.resolve({ state: Notification.permission }) :
        originalQuery(parameters)
);');
```

### Page API

#### Navigate to a URL

```php
// navigate
$navigation = $page->navigate('http://example.com');

// wait for the page to be loaded
$navigation->waitForNavigation();
```

When using ``$navigation->waitForNavigation()`` you will wait for 30sec until the page event "loaded" is triggered.
You can change the timeout or the event to listen for:

```php
use HeadlessChromium\Page;

// wait 10secs for the event "DOMContentLoaded" to be triggered
$navigation->waitForNavigation(Page::DOM_CONTENT_LOADED, 10000);
```

Available events (in the order they trigger):

- ``Page::DOM_CONTENT_LOADED``: dom has completely loaded
- ``Page::LOAD``: (default) page and all resources are loaded
- ``Page::NETWORK_IDLE``: page has loaded, and no network activity has occurred for at least 500ms

When you want to wait for the page to navigate, two main issues may occur.
First, the page takes too long to load and second, the page you were waiting to be loaded has been replaced.
The good news is that you can handle those issues using a good old try-catch:

```php
use HeadlessChromium\Exception\OperationTimedOut;
use HeadlessChromium\Exception\NavigationExpired;

try {
    $navigation->waitForNavigation();
} catch (OperationTimedOut $e) {
    // too long to load
} catch (NavigationExpired $e) {
    // Another page was loaded
}
```

#### Evaluate script on the page

Once the page has completed the navigation you can evaluate arbitrary script on this page:

```php
// navigate
$navigation = $page->navigate('http://example.com');

// wait for the page to be loaded
$navigation->waitForNavigation();

// evaluate script in the browser
$evaluation = $page->evaluate('document.documentElement.innerHTML');

// wait for the value to return and get it
$value = $evaluation->getReturnValue();
```

Sometimes the script you evaluate will click a link or submit a form. In this case, the page will reload and you
will want to wait for the new page to load.
You can achieve this by using ``$page->evaluate('some js that will reload the page')->waitForPageReload()``.
An example is available in [form-submit.php](./examples/form-submit.php)

#### Call a function

This is an alternative to ``evaluate`` that allows calling a given function with the given arguments in the page context:

```php
$evaluation = $page->callFunction(
    "function(a, b) {\n    window.foo = a + b;\n}",
    [1, 2]
);

$value = $evaluation->getReturnValue();
```

#### Add a script tag

That's useful if you want to add jQuery (or anything else) to the page:

```php
$page->addScriptTag([
    'content' => file_get_contents('path/to/jquery.js')
])->waitForResponse();

$page->evaluate('$(".my.element").html()');
```

You can also use a URL to feed the src attribute:

```php
$page->addScriptTag([
    'url' => 'https://code.jquery.com/jquery-3.3.1.min.js'
])->waitForResponse();

$page->evaluate('$(".my.element").html()');
```

#### Get the page HTML

You can get the page HTML as a string using the ``getHtml`` method.

```php
$html = $page->getHtml();
```

#### Add a script to evaluate upon page navigation

```php
$page->addPreScript('// Simulate navigator permissions;
const originalQuery = window.navigator.permissions.query;
window.navigator.permissions.query = (parameters) => (
    parameters.name === "notifications" ?
        Promise.resolve({ state: Notification.permission }) :
        originalQuery(parameters)
);');
```

If your script needs the dom to be fully populated before it runs then you can use the option "onLoad":

```php
$page->addPreScript($script, ['onLoad' => true]);
```

#### Set viewport size

This feature allows changing the size of the viewport (emulation) for the current page without affecting the size of
all the browser's pages (see also option ``"windowSize"`` of [BrowserFactory::createBrowser](#options)).
```php
$width = 600;
$height = 300;
$page->setViewport($width, $height)
    ->await(); // wait for the operation to complete
```

#### Make a screenshot

```php
// navigate
$navigation = $page->navigate('http://example.com');

// wait for the page to be loaded
$navigation->waitForNavigation();

// take a screenshot
$screenshot = $page->screenshot([
    'format'  => 'jpeg', // default to 'png' - possible values: 'png', 'jpeg'
    'quality' => 80,     // only if format is 'jpeg' - default 100
]);

// save the screenshot
$screenshot->saveToFile('/some/place/file.jpg');
```

**Screenshot an area on a page**

You can use the option "clip" to choose an area on a page for the screenshot:

```php
use HeadlessChromium\Clip;

// navigate
$navigation = $page->navigate('http://example.com');

// wait for the page to be loaded
$navigation->waitForNavigation();

// create a rectangle by specifying the top left corner coordinates + width and height
$x = 10;
$y = 10;
$width = 100;
$height = 100;
$clip = new Clip($x, $y, $width, $height);

// take the screenshot (in memory binaries)
$screenshot = $page->screenshot([
    'clip' => $clip,
]);

// save the screenshot
$screenshot->saveToFile('/some/place/file.jpg');
```

**Full-page screenshot**

You can also take a screenshot for the full-page layout (not only the viewport) using ``$page->getFullPageClip`` with attribute ``captureBeyondViewport = true``:

```php
// navigate
$navigation = $page->navigate('https://example.com');

// wait for the page to be loaded
$navigation->waitForNavigation();

$screenshot = $page->screenshot([
    'captureBeyondViewport' => true,
    'clip' => $page->getFullPageClip(),
    'format' => 'jpeg', // default to 'png' - possible values: 'png', 'jpeg'
]);

// save the screenshot
$screenshot->saveToFile('/some/place/file.jpg');
```

#### Print as PDF

```php
// navigate
$navigation = $page->navigate('http://example.com');

// wait for the page to be loaded
$navigation->waitForNavigation();

$options = [
    'landscape'           => true, // default to false
    'printBackground'     => true, // default to false
    'displayHeaderFooter' => true, // default to false
    'preferCSSPageSize'   => true, // default to false (reads parameters directly from @page)
    'marginTop'           => 0.0,  // defaults to ~0.4 (must be a float, value in inches)
    'marginBottom'        => 1.4,  // defaults to ~0.4 (must be a float, value in inches)
    'marginLeft'          => 5.0,  // defaults to ~0.4 (must be a float, value in inches)
    'marginRight'         => 1.0,  // defaults to ~0.4 (must be a float, value in inches)
    'paperWidth'          => 6.0,  // defaults to 8.5 (must be a float, value in inches)
    'paperHeight'         => 6.0,  // defaults to 11.0 (must be a float, value in inches)
    'headerTemplate'      => '