diff --git a/README.md b/README.md index 050ce8f5061..9fbb13d1351 100644 --- a/README.md +++ b/README.md @@ -12,7 +12,7 @@ Nomulus is an open source, scalable, cloud-based service for operating [top-level domains](https://en.wikipedia.org/wiki/Top-level_domain) (TLDs). It is the authoritative source for the TLDs that it runs, meaning that it is responsible for tracking domain name ownership and handling registrations, -renewals, availability checks, and WHOIS requests. End-user registrants (i.e., +renewals, availability checks, and RDAP requests. End-user registrants (i.e., people or companies that want to register a domain name) use an intermediate domain name registrar acting on their behalf to interact with the registry. @@ -97,7 +97,7 @@ Nomulus has the following capabilities: for details), and an implementation based on [Google Cloud Secret Manager](https://cloud.google.com/security/products/secret-manager) is available. -* **TPC Proxy**: Nomulus is built on top of the [Jetty](https://jetty.org/) +* **TCP Proxy**: Nomulus is built on top of the [Jetty](https://jetty.org/) container that implements the [Jakarta Servlet](https://jakarta.ee/specifications/servlet/) specification and only serves HTTP/S traffic. A proxy to translate raw TCP traffic (e.g., EPP) to and from HTTP is provided. diff --git a/console-webapp/README.md b/console-webapp/README.md index 99866a14075..553fcca5846 100644 --- a/console-webapp/README.md +++ b/console-webapp/README.md @@ -9,15 +9,14 @@ expected to change. ## Deployment -The webapp is deployed with the nomulus default service war to GKE. -During nomulus default service war build task, gradle script triggers the -following: +The webapp is deployed as part of the default Nomulus GKE service image. 
+During the image build task, the Gradle script triggers the following: 1) Console webapp build script `buildConsoleWebapp`, which installs dependencies, assembles a compiled ts -> js, minified, optimized static artifact (html, css, js) -2) Artifact assembled in step 1 then gets copied to core project web artifact - location, so that it can be deployed with the rest of the core webapp +2) Artifact assembled in step 1 then gets copied to the jetty webapp resource + location, so that it can be staged inside the default GKE service container. ## Development server diff --git a/core/src/main/java/google/registry/config/files/tasks/cloud-scheduler-tasks-production.xml b/core/src/main/java/google/registry/config/files/tasks/cloud-scheduler-tasks-production.xml index a6b57a01d24..25d385a7aa6 100644 --- a/core/src/main/java/google/registry/config/files/tasks/cloud-scheduler-tasks-production.xml +++ b/core/src/main/java/google/registry/config/files/tasks/cloud-scheduler-tasks-production.xml @@ -266,16 +266,6 @@ 0 15 * * * - - - wipeOutContactHistoryPii - - This job runs weekly to wipe out PII fields of ContactHistory entities - that have been in the database for a certain period of time. - - 0 15 * * 1 - - bsaDownload diff --git a/core/src/main/java/google/registry/config/files/tasks/cloud-scheduler-tasks-sandbox.xml b/core/src/main/java/google/registry/config/files/tasks/cloud-scheduler-tasks-sandbox.xml index 11451d54fad..79bc6022464 100644 --- a/core/src/main/java/google/registry/config/files/tasks/cloud-scheduler-tasks-sandbox.xml +++ b/core/src/main/java/google/registry/config/files/tasks/cloud-scheduler-tasks-sandbox.xml @@ -155,16 +155,6 @@ */1 * * * * - - - wipeOutContactHistoryPii - - This job runs weekly to wipe out PII fields of ContactHistory entities - that have been in the database for a certain period of time. 
- - 0 15 * * 1 - - bsaDownload diff --git a/core/src/test/java/google/registry/webdriver/README.md b/core/src/test/java/google/registry/webdriver/README.md index ee6c694100a..e107b5b099e 100644 --- a/core/src/test/java/google/registry/webdriver/README.md +++ b/core/src/test/java/google/registry/webdriver/README.md @@ -12,7 +12,7 @@ for more information on using webdriver. 2. Missing golden images * If you added a new test using screenshot comparison, you have to generate the golden image for that test in advance and copy it to - [goldens/](https://github.com/google/nomulus/tree/master/core/src/test/java/google/registry/webdriver/goldens) + [goldens/](https://github.com/google/nomulus/tree/master/core/src/test/resources/google/registry/webdriver/goldens) folder. There is an auxiliary Gradle build task to help with this, and here are some examples: ```shell diff --git a/docs/architecture.md b/docs/architecture.md index a09da323a90..4964f8e8486 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -102,6 +102,14 @@ Here are the task queues in use by the system: run infrequently, such as exporting reserved terms. * `sheet` -- Queue for tasks to sync registrar updates to a Google Sheets spreadsheet, done by `SyncRegistrarsSheetAction`. +* `async-actions` -- Queue for general asynchronous actions that should be run at + some point in the future. +* `async-host-rename` -- Queue for tasks that trigger domain DNS updates upon + host renaming. +* `beam-reporting` -- Queue for tasks that wait for a Beam pipeline (such as + Spec11 reporting or invoicing) to complete. +* `console-user-group-update` -- Queue for tasks that update membership in the + Google Groups for console users. ### Scheduled cron jobs @@ -113,13 +121,65 @@ minute (in the case of syncing DNS updates) or as infrequently as once per month are more tasks that run in Production than in other environments because tasks like uploading RDE dumps are only done for the live system. 
+Here are the primary cron tasks configured in the production environment: + +* **`rdeStaging`** (`/_dr/task/rdeStaging`) -- Generates a full RDE escrow + deposit as a single large XML document and streams it to Google Cloud + Storage daily. +* **`rdeUpload`** (`/_dr/task/rdeUpload`) -- Uploads already-generated RDE + files from GCS to the escrow provider (e.g. Iron Mountain) via SFTP. +* **`rdeReport`** (`/_dr/task/rdeReport`) -- Uploads RDE reports to ICANN. +* **`tmchDnl`**, **`tmchSmdrl`**, **`tmchCrl`** (`/_dr/task/tmchDnl`, + `tmchSmdrl`, `tmchCrl`) -- Download the latest Domain Name Label list, + Signed Mark Revocation List, and Certificate Revocation List from MarksDB + and update the registry database. +* **`syncGroupMembers`** (`/_dr/task/syncGroupMembers`) -- Syncs registrar + contact changes from the database to Google Groups. +* **`syncRegistrarsSheet`** (`/_dr/task/syncRegistrarsSheet`) -- Synchronizes + registrar entities to a Google Sheets spreadsheet for business visibility. +* **`updateRegistrarRdapBaseUrls`** (`/_dr/task/updateRegistrarRdapBaseUrls`) + -- Reloads all registrar RDAP base URLs from ICANN. +* **`exportDomainLists`** (`/_dr/task/exportDomainLists`) -- Exports active + domain lists to GCS and Google Drive. +* **`expandBillingRecurrences`** (`/_dr/task/expandBillingRecurrences`) -- + Generates synthetic one-time billing events from recurring billing setup. +* **`deleteExpiredDomains`** (`/_dr/task/deleteExpiredDomains`) -- Deletes + domains that are past their auto-renew end date daily. +* **`sendExpiringCertificateNotificationEmail`** + (`/_dr/task/sendExpiringCertificateNotificationEmail`) -- Notifies + registrars of upcoming SSL certificate expirations. +* **`nordnUploadSunrise`**, **`nordnUploadClaims`** (`/_dr/task/nordnUpload`) + -- Upload LORDN Sunrise/Claims CSV files to MarksDB. +* **`deleteProberData`** (`/_dr/task/deleteProberData`) -- Daily cleanup of + test data generated by probers. 
+* **`exportReservedTerms`**, **`exportPremiumTerms`** + (`/_dr/task/exportReservedTerms`, `exportPremiumTerms`) -- Export reserved + and premium terms to Google Drive. +* **`readDnsRefreshRequests`** (`/_dr/task/readDnsRefreshRequests`) -- Reads + DNS refresh requests from the database and batches them to the publish queue + every minute. +* **`icannReportingStaging`**, **`icannReportingUpload`** + (`/_dr/task/icannReportingStaging`, `icannReportingUpload`) -- Stage monthly + ICANN activity/transaction reports and upload them. +* **`generateInvoices`** (`/_dr/task/generateInvoices`) -- Starts Dataflow + templates to generate monthly billing invoices. +* **`generateSpec11`** (`/_dr/task/generateSpec11`) -- Starts Dataflow + templates to generate daily Spec11 anti-abuse reports. +* **`bsaDownload`**, **`bsaRefresh`**, **`bsaValidate`**, + **`uploadBsaUnavailableNames`** -- Download block lists, refresh registered + names, validate data, and upload unavailable names for the Brand Safety + Alliance (BSA) service. +* **`triggerMosApiServiceState`** (`/_dr/task/triggerMosApiServiceState`) -- + Fetches the service state from MosAPI and triggers metrics status for all + TLDs every 5 minutes. + Most cron tasks use the `TldFanoutAction` which is accessed via the `/_dr/cron/fanout` URL path. This action fans out a given cron task for each TLD that exists in the registry system, using the queue that is specified in the XML entry. Because some tasks may be computationally intensive and could risk spiking system latency if all start executing immediately at the same time, there is a `jitterSeconds` parameter that spreads out tasks over the given -number of seconds. This is used with DNS updates and commit log deletion. +number of seconds. This is used with DNS updates. 
The reason the `TldFanoutAction` exists is that a lot of tasks need to be done separately for each TLD, such as RDE exports and NORDN uploads. It's simpler to @@ -130,8 +190,9 @@ failures that a raw cron task does not. This is why there are some tasks that do not fan out across TLDs that still use `TldFanoutAction` -- it's so that the tasks retry in the face of transient errors. -The full list of URL parameters to `TldFanoutAction` that can be specified in -cron.xml is: +The full list of URL parameters to `TldFanoutAction` that can be specified in the +Cloud Scheduler configuration files (such as +`cloud-scheduler-tasks-production.xml`) is: * `endpoint` -- The path of the action that should be executed * `queue` -- The cron queue to enqueue tasks in. @@ -156,7 +217,7 @@ which includes a separate database and separate bulk storage in Cloud Storage. Each environment is thus completely independent. The different environments are specified in `RegistryEnvironment`. Most -correspond to a separate App Engine app except for `UNITTEST` and `LOCAL`, which +correspond to a separate GCP project except for `UNITTEST` and `LOCAL`, which by their nature do not use real environments running in the cloud. The recommended project naming scheme that has the best possible compatibility with the codebase and thus requires the least configuration is to pick a name for the @@ -185,14 +246,14 @@ real to not-real, is: the entire system down until it is completed) without affecting the QA environment. * `ALPHA` -- The developers' playground. Experimental builds are routinely - pushed here in order to test them on a real app running on App Engine. You + pushed here in order to test them on a real GKE cluster running in GCP. You may end up wanting multiple environments like Alpha if you regularly experience contention (i.e. developers being blocked from testing their code on Alpha because others are already using it). 
* `LOCAL` -- A fake environment that is used when running the app locally on a - simulated App Engine instance. + simulated instance. * `UNITTEST` -- A fake environment that is used in unit tests, where - everything in the App Engine stack is simulated or mocked. + everything in the cloud stack is simulated or mocked. ## Release process diff --git a/docs/authentication-framework.md b/docs/authentication-framework.md index 50c62646930..f19514ac1e0 100644 --- a/docs/authentication-framework.md +++ b/docs/authentication-framework.md @@ -185,12 +185,13 @@ architecture -- an `Authorization` HTTP header of the form "Bearer: XXXX". ### Configuration -The `auth` block of the configuration requires two fields: * -`allowedServiceAccountEmails` is the list of service accounts that should be -allowed to run tasks when internally authenticated. This will likely include -whatever service account runs Nomulus in Google Kubernetes Engine, as well as -the Cloud Scheduler service account. * `oauthClientId` is the OAuth client ID -associated with IAP. This is retrievable from the -[Clients page](https://pantheon.corp.google.com/auth/clients) of GCP after -enabling the Identity-Aware Proxy. It should look something like -`someNumbers-someNumbersAndLetters.apps.googleusercontent.com` +The `auth` block of the configuration requires two fields: + +* `allowedServiceAccountEmails` is the list of service accounts that should be + allowed to run tasks when internally authenticated. This will likely include + whatever service account runs Nomulus in Google Kubernetes Engine, as well + as the Cloud Scheduler service account. +* `oauthClientId` is the OAuth client ID associated with IAP. This is retrievable + from the [Clients page](https://pantheon.corp.google.com/auth/clients) of GCP + after enabling the Identity-Aware Proxy. 
It should look something like + `someNumbers-someNumbersAndLetters.apps.googleusercontent.com` diff --git a/docs/code-structure.md b/docs/code-structure.md index 7fc9dd1dbc8..5945fdc3e90 100644 --- a/docs/code-structure.md +++ b/docs/code-structure.md @@ -20,12 +20,13 @@ versions stored in the various `gradle.lockfile` files. To update these versions, run any Gradle command (e.g. `./gradlew build`) with the `--write-locks` argument. -### Generating WAR archives for deployment +### Generating Docker images for deployment -The `jetty` project is the main entry point for building the Nomulus WAR files, -and one can use the `war` gradle task to build the base WAR file. The various -deployment/release files use Docker to deploy this, in a system that is too -Google-specialized to replicate directly here. +The `jetty` project is the main entry point for building the Nomulus Docker +images. You can use the `./gradlew :jetty:buildNomulusImage` task to build the +image locally, which contains the compiled WAR files and Angular assets staged +inside a Jetty base image. You can use `./gradlew :jetty:pushNomulusImage` to +push this image to your GCR/Artifact Registry repository. ## Subprojects @@ -68,6 +69,12 @@ The following cursor types are defined: events into one-time `BillingEvent`s. * **`SYNC_REGISTRAR_SHEET`** - Tracks the last time the registrar spreadsheet was successfully synced. +* **`ICANN_UPLOAD_TX`** - Tracks monthly uploads of ICANN transaction reports. +* **`ICANN_UPLOAD_ACTIVITY`** - Tracks monthly uploads of ICANN activity reports. +* **`REMOTE_CACHE_DOMAIN_SYNC`** - Tracks the reflection of domain changes in + the remote cache. +* **`REMOTE_CACHE_HOST_SYNC`** - Tracks the reflection of host changes in + the remote cache. 
All `Cursor` entities in the database contain a `DateTime` that represents the next timestamp at which an operation should resume processing and a `CursorType` diff --git a/docs/configuration.md b/docs/configuration.md index 9c49fe00727..5639d666f83 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -117,16 +117,19 @@ For the Nomulus tool OAuth configuration, do the following steps: `registryTool` section. This will make the `nomulus` tool use this credential to authenticate itself to the system. -For IAP configuration, do the following steps: * **Create the IAP client ID:** -Follow similar steps from above to create an additional OAuth client ID, but -using an application type of "Web application". Note the client ID and secret. * -**Enable IAP for your HTTPS load balancer:** On the -[IAP page](https://pantheon.corp.google.com/security/iap), enable IAP for all of -the backend services that all use the same HTTPS load balancer. * **Use a custom -OAuth configuration:** For the backend services, under the "Settings" section -(in the three-dot menu) enable custom OAuth and insert the client ID and secret -that we just created * **Save the client ID:** In the configuration file, save -the client ID as `oauthClientId` in the `auth` section +For IAP configuration, do the following steps: + +* **Create the IAP client ID:** Follow similar steps from above to create an + additional OAuth client ID, but using an application type of "Web + application". Note the client ID and secret. +* **Enable IAP for your HTTPS load balancer:** On the + [IAP page](https://pantheon.corp.google.com/security/iap), enable IAP for + all of the backend services that all use the same HTTPS load balancer. +* **Use a custom OAuth configuration:** For the backend services, under the + "Settings" section (in the three-dot menu), enable custom OAuth and + insert the client ID and secret that we just created. 
+* **Save the client ID:** In the configuration file, save the client ID as + `oauthClientId` in the `auth` section. Once these steps are taken, the `nomulus` tool and IAP will both use client IDs which the server is configured to accept, and authentication should succeed. @@ -171,9 +174,8 @@ To create or update TLDs, we use configure_tld` command. Because the TLDs are stored as data in the running system, they do not require code pushes to update. -[app-engine-config]: https://cloud.google.com/appengine/docs/java/configuration-files -[default-config]: https://github.com/google/nomulus/blob/master/java/google/registry/config/files/default-config.yaml -[registry-config]: https://github.com/google/nomulus/blob/master/java/google/registry/config/RegistryConfig.java +[default-config]: https://github.com/google/nomulus/blob/master/core/src/main/java/google/registry/config/files/default-config.yaml +[registry-config]: https://github.com/google/nomulus/blob/master/core/src/main/java/google/registry/config/RegistryConfig.java ## Cloud SQL Configuration @@ -244,9 +246,9 @@ something similar. However, for purposes of this exercise we will push the schema from the build system. First, download the -[Cloud SQL Proxy](https://cloud.google.com/sql/docs/mysql/sql-proxy). This will -allow you to connect to your database from a local workstation without a lot of -additional configuration. +[Cloud SQL Auth Proxy](https://cloud.google.com/sql/docs/postgres/sql-proxy). +This will allow you to connect to your database from a local workstation without +a lot of additional configuration. Create a service account for use with the proxy: @@ -277,12 +279,11 @@ Now start the proxy: ``` $ PORT=3306 # Use a different value for this if you like. -$ ./cloud_sql_proxy -credential_file=sql-admin.json \ - -instances=$PROJECT_ID:nomulus=tcp:$PORT -2020/07/01 12:11:20 current FDs rlimit set to 32768, wanted limit is 8500. Nothing to do here. 
-2020/07/01 12:11:20 using credential file for authentication; email=sql-proxy@pproject-id.iam.gserviceaccount.com -2020/07/01 12:11:20 Listening on 127.0.0.1:3306 for project-id:nomulus -2020/07/01 12:11:20 Ready for new connections +$ ./cloud-sql-proxy --credentials-file=sql-admin.json --port=$PORT \ + $PROJECT_ID:us-central1:nomulus +2026/06/16 12:11:20 Authorizing with credentials file: sql-admin.json +2026/06/16 12:11:20 Listening on 127.0.0.1:3306 for project-id:us-central1:nomulus +2026/06/16 12:11:20 The proxy has started successfully and is ready for new connections! ``` Finally, upload the new database schema: @@ -366,8 +367,8 @@ $ nomulus -e $ENV update_keyring_secret --keyname TOOLS_CLOUD_SQL_PASSWORD \ Use get_keyring_secret command to verify the data you put in: ``` -$ nomulus -e alpha -e alpha get_keyring_secret --keyname CLOUD_SQL_PASSWORD +$ nomulus -e alpha get_keyring_secret --keyname CLOUD_SQL_PASSWORD [your password] -$ nomulus -e alpha -e alpha get_keyring_secret --keyname CLOUD_SQL_PASSWORD +$ nomulus -e alpha get_keyring_secret --keyname TOOLS_CLOUD_SQL_PASSWORD [your password] ``` diff --git a/docs/first-steps-tutorial.md b/docs/first-steps-tutorial.md index 2d7684d42bf..cea8ac085a9 100644 --- a/docs/first-steps-tutorial.md +++ b/docs/first-steps-tutorial.md @@ -21,16 +21,19 @@ it'll never be created for real on the Internet at large. Then, [example template](https://github.com/google/nomulus/blob/master/core/src/test/resources/google/registry/tools/tld.yaml) as a guide. -The fields you'll want to change from the template: * `driveFolderId` should be -null * `roidSuffix` should be `EXAMPLE` -- this is the suffix that will be used -for repository ids of domains on the TLD. This suffix must be all uppercase and -a maximum of eight ASCII characters and can be set to the upper-case equivalent -of our TLD name (if it is 8 characters or fewer), such as "EXAMPLE." You can -also abbreviate the upper-case TLD name down to 8 characters. 
Refer to the -[gTLD Registry Advisory: Correction of non-compliant ROIDs][roids] for further -information. * `tldStr` should be `example` * `tldType` should be `TEST`, which -identifies that the TLD is for testing purposes, whereas `REAL` would identify -the TLD as a live TLD +The fields you'll want to change from the template: + +* `driveFolderId` should be null. +* `roidSuffix` should be `EXAMPLE` -- this is the suffix that will be used + for repository ids of domains on the TLD. This suffix must be all uppercase and + a maximum of eight ASCII characters and can be set to the upper-case equivalent + of our TLD name (if it is 8 characters or fewer), such as "EXAMPLE." You can + also abbreviate the upper-case TLD name down to 8 characters. Refer to the + [gTLD Registry Advisory: Correction of non-compliant ROIDs][roids] for further + information. +* `tldStr` should be `example`. +* `tldType` should be `TEST`, which identifies that the TLD is for testing purposes, + whereas `REAL` would identify the TLD as a live TLD. ```shell $ nomulus -e alpha configure_tld --input=example.yaml diff --git a/docs/gradle.md b/docs/gradle.md index 59dce7d78af..3d67deb0715 100644 --- a/docs/gradle.md +++ b/docs/gradle.md @@ -34,6 +34,10 @@ providing the test project as an argument, e.g. ./gradlew deployNomulus -Penvironment=alpha ``` +Note: Deploying to GCP requires Docker to be running locally (to build the +Nomulus container image) and `gcloud` credentials to be configured with access +to the target GCP project. + ### Notable Issues Test suites (RdeTestSuite and TmchTestSuite) are ignored to avoid duplicate diff --git a/docs/install.md b/docs/install.md index dad31b6f9e1..d46468b8166 100644 --- a/docs/install.md +++ b/docs/install.md @@ -6,9 +6,9 @@ This document covers the steps necessary to download, build, and deploy Nomulus. You will need the following programs installed on your local machine: -* A recent version of the [Java 21 JDK][java-jdk21]. 
+* A recent version of the [Java 25 JDK][java-jdk25]. * The [Google Cloud CLI](https://docs.cloud.google.com/sdk/docs/install-sdk) - (configure an alias to the `gcloud`utility, because you'll use it a lot) + (configure an alias to the `gcloud` utility, because you'll use it a lot) * [Git](https://git-scm.com/) version control system. * Docker (confirm with `docker info` no permission issues, use `sudo groupadd docker` for sudoless docker). @@ -63,7 +63,7 @@ while. ## Create and configure a GCP project First, -[create an application](https://cloud.google.com/appengine/docs/java/quickstart) +[create a project][create-project] on Google Cloud Platform. Make sure to choose a good Project ID, as it will be used repeatedly in a large number of places. If your company is named Acme, then a good Project ID for your production environment would be "acme-registry". Keep @@ -123,10 +123,14 @@ $ gcloud container clusters create proxy-cluster \ --num-nodes=3 \ --enable-ip-alias ``` +Then create an artifact repository: -Then create an artifact repository: `shell $ gcloud artifacts repositories -create nomulus-repo \ --repository-format=docker \ --location=$REGION \ ---description="Nomulus Docker images"` +```shell +$ gcloud artifacts repositories create nomulus-repo \ + --repository-format=docker \ + --location=$REGION \ + --description="Nomulus Docker images" +``` See the files and documentation in the `release/` folder for more information on the release process. You will likely need to customize the internal build @@ -141,7 +145,8 @@ can rebuild and start using the `nomulus` tool to create test entities in your newly deployed system. See the [first steps tutorial](./first-steps-tutorial.md) for more information. 
-[java-jdk21]: https://www.oracle.com/java/technologies/javase-downloads.html +[java-jdk25]: https://www.oracle.com/java/technologies/javase-downloads.html +[create-project]: https://cloud.google.com/resource-manager/docs/creating-managing-projects ## Deploy the Beam Pipelines diff --git a/docs/local-testing.md b/docs/local-testing.md index 03ce092e65b..cefd0ec2943 100644 --- a/docs/local-testing.md +++ b/docs/local-testing.md @@ -9,9 +9,10 @@ useful for doing web UI development (i.e. the registrar console). It allows you to update Typescript, HTML, and CSS and see the changes simply by refreshing the relevant page in your browser. -In order to serve content locally, there are two services that must be run: * -the `RegistryTestServer` to serve as the backing server * the Angular service to -provide the UI files +In order to serve content locally, there are two services that must be run: + +* The `RegistryTestServer` to serve as the backing server. +* The Angular service to provide the UI files. In order to do this in one step, from the `console-webapp` folder, run: @@ -24,7 +25,7 @@ This will start both the `RegistryTestServer` and the Angular testing service. Any changes to Typescript/HTML/CSS files will be recompiled and available on page reload. -One it is running, you can interact with the console by going to +Once it is running, you can interact with the console by going to `http://localhost:4200` to view the registrar console in a web browser. The server will continue running until you terminate the process. diff --git a/docs/operational-procedures.md b/docs/operational-procedures.md index e2eedd6ea28..27e613e8b38 100644 --- a/docs/operational-procedures.md +++ b/docs/operational-procedures.md @@ -10,22 +10,37 @@ instrument internal state within the Nomulus internal environment. This is broadly called white-box monitoring. EPP, DNS, and RDAP are instrumented. 
The metrics monitored are as follows: -* `/custom/dns/publish_domain_requests` -- A count of publish domain requests, - described by the target TLD and the return status code from the underlying - DNS implementation. -* `/custom/dns/publish_host_requests` -- A count of publish host requests, - described by the target TLD and the return status code from the underlying - DNS implementation. -* `/custom/epp/requests` -- A count of EPP requests, described by command - name, client id, and return status code. -* `/custom/epp/processing_time` -- A [Distribution][distribution] representing - the processing time for EPP requests, described by command name, client id, +* `/dns/publish_domain_requests` -- A count of publish domain requests, + described by the target TLD and the publish status. +* `/dns/publish_host_requests` -- A count of publish host requests, + described by the target TLD and the publish status. +* `/epp/requests` -- A count of EPP requests, described by command + name, client (registrar) id, and return status code. +* `/epp/request_time` -- A [Distribution][distribution] representing + the processing time for EPP requests, described by command name, traffic type, and return status code. -* `/custom/rdap/requests` -- A count of RDAP requests, described by command - name, number of returned results, and return status code. -* `/custom/rdap/processing_time` -- A [Distribution][distribution] - representing the processing time for RDAP requests, described by command - name, number of returned results, and return status code. +* `/rdap/requests` -- A count of RDAP requests, described by endpoint + type, deleted inclusion, registrar specification, authorization, and + HTTP method. +* `/rdap/request_time` -- A [Distribution][distribution] + representing the processing time for RDAP requests, described by endpoint + type, search type, wildcard type, HTTP status code, and + incompleteness warning type. 
+* `/lock/acquire_lock_requests` -- A count of lock acquisition attempts, + described by TLD, resource name, and the existing lock state. +* `/lock/lock_duration` -- A [Distribution][distribution] representing + the lock lifetime in milliseconds, described by TLD and resource name. +* `/cache/lookups` -- A count of cache lookups, described by cache name + (e.g. domain, host) and the hit type (LOCAL, REMOTE, MISS, + MISS_NONEXISTENT). +* `/domain_label/reserved/checks` -- A count of reserved list checks, + described by TLD, number of matching lists, most severe list name, and + most severe reservation type. +* `/domain_label/reserved/processing_time` -- A [Distribution][distribution] + representing the amount of time in milliseconds required to check a label + against all reserved lists. +* `/domain_label/reserved/hits` -- A count of reserved list hits, + described by TLD, reserved list name, and the reservation type found. Follow the guide to [set up a Stackdriver account](https://cloud.google.com/monitoring/accounts/guide) @@ -60,11 +75,11 @@ Cursors can be updated as follows: ```shell $ nomulus -e {ENVIRONMENT} update_cursors exampletld --type RDE_STAGING \ --timestamp 2016-09-01T00:00:00Z -Update Cursor@ahFzfmRvbWFpbi1yZWdpc3RyeXIzCxIPRW50aXR5R3JvdXBSb290Igljcm9zcy10bGQMCxIIUmVnaXN0cnkiB3lvdXR1YmUM_RDE_STAGING -cursorTime: 2016-09-23T00:00:00.000Z -> 2016-09-01T00:00:00.000Z +Change cursorTime of RDE_STAGING for Scope:exampletld to 2016-09-01T00:00:00Z Perform this command? (y/N): Y -Updated 1 entities. +Running ... +Updated 1 cursors. ``` ## gTLD reporting @@ -89,9 +104,11 @@ Nomulus provides ICANN requires monthly activity and transaction reporting. The details are contained in Specification 3 of the [registry agreement][registry-agreement]. -These reports are mostly generated by querying the Cloud SQL database. There is -currently a Google proprietary class to query DNS related activities that is not -included in the open source Nomulus release. 
+These reports are generated by querying BigQuery, using database snapshots +loaded into BigQuery. The default `DnsCountQueryCoordinator` implementation +(`CloudDnsCountQueryCoordinator`) relies on Google-internal DNS tables, so +external users will need to provide their own implementation to query their DNS +statistics. ### Zone File Access (ZFA) diff --git a/docs/operational-procedures/brda-deposits.md b/docs/operational-procedures/brda-deposits.md index c6c198985f7..7592a32f0d4 100644 --- a/docs/operational-procedures/brda-deposits.md +++ b/docs/operational-procedures/brda-deposits.md @@ -7,7 +7,7 @@ deposits). Some information related to BRDA can be found at: https://icannwiki.com/Onboarding_Information_Request#BRDA BRDA deposits are generated by the -[RdeStagingAction](https://github.com/google/nomulus/blob/master/java/google/registry/rde/RdeStagingAction.java) +[RdeStagingAction][rde-staging-action] job. This is the same job that generates RDE deposits. Its Javadoc goes into great detail about how it's implemented. @@ -44,24 +44,54 @@ The cursor can be checked using the `nomulus pending_escrow` command. command output doesn't contain any TLDs for tests. ```shell -$ nomulus -e production list_tlds --fields=tldStr,tldType | grep REAL | awk '{print $1}' > realtlds.txt` +$ nomulus -e production list_tlds --fields=tldStr,tldType \ + | grep REAL | awk '{print $1}' > realtlds.txt ``` -* Generate .ryde and .sig files of TLDs specified for given date(s) in the - current directory. 
+* Kick off the server-side generation of thin escrow XML files (GhostRyDE + encrypted) under the GCS RDE bucket manual directory: ```shell -$ mkdir /tmp/brda.$$; for date in 2015-02-26 2015-03-05; \ +$ for date in 2015-02-26 2015-03-05; \ do for tld in $(cat realtlds.txt); \ - do nomulus -e production create_brda_deposit --tld=${tld} --watermark=${date}T00:00:00Z --outdir=/tmp/brda.$$ & sleep 30; \ + do nomulus -e production generate_escrow_deposit \ + --tld=${tld} \ + --watermark=${date}T00:00:00Z \ + --mode=THIN \ + --outdir=manual_brda; \ done; \ done ``` -* Store the generated files to the GCS bucket. +* Download and decrypt the GhostRyDE files locally, then encrypt them for + sending to the escrow provider (note that files are located in + subdirectories named after the Dataflow job under `manual_brda`, which + we match using a wildcard): ```shell -$ gcloud storage cp /tmp/brda.$$/*.{ryde,sig} gs://{PROJECT-ID}-icann-brda/` +$ mkdir /tmp/brda_out; for date in 2015-02-26 2015-03-05; \ + do for tld in $(cat realtlds.txt); \ + do \ + gcloud storage cat \ + gs://{PROJECT-ID}-rde/manual/manual_brda/*/${tld}_${date}_thin_S1_R0.xml.ghostryde \ + | nomulus -e production ghostryde --decrypt \ + > /tmp/${tld}_${date}_thin_S1_R0.xml; \ + nomulus -e production encrypt_escrow_deposit \ + --mode=THIN \ + --tld=${tld} \ + --input=/tmp/${tld}_${date}_thin_S1_R0.xml \ + --outdir=/tmp/brda_out; \ + rm /tmp/${tld}_${date}_thin_S1_R0.xml; \ + done; \ + done +``` + +* Store the generated `.ryde` and `.sig` files to the BRDA GCS bucket. + +```shell +$ gcloud storage cp /tmp/brda_out/*.{ryde,sig} gs://{PROJECT-ID}-icann-brda/ ``` * Mirror the files in the GCS bucket to the sFTP server. 
+ +[rde-staging-action]: https://github.com/google/nomulus/blob/master/core/src/main/java/google/registry/rde/RdeStagingAction.java diff --git a/docs/operational-procedures/premium-list-management.md b/docs/operational-procedures/premium-list-management.md index 7dc1733cec0..6c861933f03 100644 --- a/docs/operational-procedures/premium-list-management.md +++ b/docs/operational-procedures/premium-list-management.md @@ -40,11 +40,13 @@ Once the file containing the premium prices is ready, run the `create_premium_list` command to load it into the database as follows: ```shell -$ nomulus -e {ENVIRONMENT} create_premium_list -n exampletld -i exampletld.txt +$ nomulus -e {ENVIRONMENT} create_premium_list -n exampletld \ + -i exampletld.txt -c USD -You are about to save the premium list exampletld with 2 items: +Create new premium list for exampletld? Perform this command? (y/N): y -Successfully saved premium list exampletld +Running ... +Saved premium list exampletld with 2 entries. ``` `-n` is the name of the list to be created, and `-i` is the input filename. Note @@ -64,9 +66,12 @@ from a text file, the procedure is exactly the same, except using the ```shell $ nomulus -e {ENVIRONMENT} update_premium_list -n exampletld -i exampletld.txt -You are about to save the premium list exampletld with 2 items: +Update premium list for exampletld? + Old List: PremiumList{name=exampletld, ...} + New List: PremiumList{name=exampletld, ...} Perform this command? (y/N): y -Successfully saved premium list exampletld +Running ... +Saved premium list exampletld with 2 entries. ``` ### Note: @@ -140,8 +145,11 @@ $ nomulus -e production check_domain {domain_name} **Note that the list can be cached for up to 60 minutes, so the old value may still be returned for a little while**. 
If it is urgent that the new pricing -changes be applied, and it's OK to potentially interrupt client connections, -then you can use the GCP web console to kill instances of the `frontend` -service, as the cache is per-instance. Once you've killed all the existing -instances (don't kill them all at once!), all the newly spun up instances will -now be using the new values you've configured. +changes be applied, you can perform a rolling restart of the `frontend` service +deployment: + +```shell +$ kubectl rollout restart deployment frontend +``` + +This will cycle the pods and clear the per-instance caches without causing downtime. diff --git a/docs/operational-procedures/rde-deposits.md b/docs/operational-procedures/rde-deposits.md index 33b9fc2cecd..cfb03663b20 100644 --- a/docs/operational-procedures/rde-deposits.md +++ b/docs/operational-procedures/rde-deposits.md @@ -9,11 +9,11 @@ have this requirement, in which case the following would not apply. The RDE process takes care of escrow deposit processing. It happens in three phases: -1. [Staging](https://github.com/google/nomulus/blob/master/java/google/registry/rde/RdeStagingAction.java): +1. [Staging][rde-staging-action]: Generate XML deposit and XML report files on Google Cloud Storage. -2. [Upload](https://github.com/google/nomulus/blob/master/java/google/registry/rde/RdeUploadAction.java): +2. [Upload][rde-upload-action]: Transmit XML deposit to the escrow provider via sFTP. -3. [Report](https://github.com/google/nomulus/blob/master/java/google/registry/rde/RdeReportAction.java): +3. [Report][rde-report-action]: Transmit XML *report* file to ICANN via HTTPS. Each phase happens with a GCP task queue entry that retries on failure. When @@ -101,7 +101,7 @@ that no cooldown period is necessary. You can list the files in Cloud Storage for a given TLD using the gcloud storage tool. 
All files are stored in the {PROJECT-ID}-rde bucket, where {PROJECT-ID} is -the name of the App Engine project for the particular environment you are +the name of the GCP project for the particular environment you are checking. ```shell @@ -114,7 +114,7 @@ gs://{PROJECT-ID}-rde/zip_2015-05-16.xml.length ## Normal launch Under normal circumstances, RDE is launched by TldFanoutAction, configured in -cron.xml. If the App Engine's cron executor isn't working, you can spawn it +Cloud Scheduler. If the Cloud Scheduler trigger fails, you can spawn it manually by visiting the following URL: ``` @@ -192,8 +192,7 @@ In general RDE files should be regenerated by updating the cursors to before the desired date and then re-running the mapreduce (either by waiting until the next scheduled cron execution, or by manually invoking the RDE staging action). -In very rare cases (and only for small TLDs) you can use `nomulus` to generate -the RDE files on a per-TLD basis. Here's an example: +In very rare cases, you can use `nomulus` to trigger the RDE generation manually on a per-TLD basis. Here's how: ```shell # Define the tld/date combination. @@ -201,17 +200,34 @@ the RDE files on a per-TLD basis. Here's an example: $ tld=xxx $ date=2015-05-16 -# 1. Generate the deposit. This drops a bunch of files in the current directory. +# 1. Trigger the server-side staging. This enqueues a Cloud Task that runs +# RdeStagingAction on GKE, writing the generated GhostRyDE files to the GCS RDE +# bucket manual directory. 
-$ nomulus -e production generate_escrow_deposit --tld=${tld} --watermark=${date}T00:00:00Z +$ nomulus -e production generate_escrow_deposit \ + --tld=${tld} \ + --watermark=${date}T00:00:00Z \ + --outdir=manual_rde -$ ls -l -total 22252 --rw-r----- 1 tjb eng     2292 May 19 16:49 xxx_2015-05-16_full_S1_R0-report.xml --rw-r----- 1 tjb eng  1321343 May 19 16:49 xxx_2015-05-16_full_S1_R0.ryde --rw-r----- 1 tjb eng      361 May 19 16:49 xxx_2015-05-16_full_S1_R0.sig --rw-r----- 1 tjb eng 21448005 May 19 16:49 xxx_2015-05-16_full_S1_R0.xml --rw-r----- 1 tjb eng      977 May 19 16:49 xxx.pub +# 2. Download and decrypt the staged GhostRyDE files locally for inspection or validation. +# Note that the files will be stored in a subdirectory named after the Dataflow job +# (you can find the job ID in the output of the previous command, or by running +# `gcloud storage ls gs://{PROJECT-ID}-rde/manual/manual_rde/`): + +$ JOB_NAME=rde-2015-05-16t000000z-some-suffix +$ gcloud storage cp \ + gs://{PROJECT-ID}-rde/manual/manual_rde/${JOB_NAME}/${tld}_${date}_full_S1_R0.xml.ghostryde . +$ gcloud storage cp \ + gs://{PROJECT-ID}-rde/manual/manual_rde/${JOB_NAME}/${tld}_${date}_full_S1_R0-report.xml.ghostryde . + +$ nomulus -e production ghostryde --decrypt \ + --input=${tld}_${date}_full_S1_R0.xml.ghostryde \ + --output=${tld}_${date}_full_S1_R0.xml +$ nomulus -e production ghostryde --decrypt \ + --input=${tld}_${date}_full_S1_R0-report.xml.ghostryde \ + --output=${tld}_${date}_full_S1_R0-report.xml + +# 3. 
Validate the decrypted XML file: $ nomulus -e production validate_escrow_deposit -i ${tld}_${date}_full_S1_R0.xml ID: AAAACTK2AU2AA @@ -220,30 +236,24 @@ Type: FULL Watermark: 2015-05-16T00:00:00.000Z RDE Version: 1.0 RDE Object URIs: -  - urn:ietf:params:xml:ns:rdeDomain-1.0 -  - urn:ietf:params:xml:ns:rdeHeader-1.0 -  - urn:ietf:params:xml:ns:rdeHost-1.0 -  - urn:ietf:params:xml:ns:rdeRegistrar-1.0 + - urn:ietf:params:xml:ns:rdeDomain-1.0 + - urn:ietf:params:xml:ns:rdeHeader-1.0 + - urn:ietf:params:xml:ns:rdeHost-1.0 + - urn:ietf:params:xml:ns:rdeRegistrar-1.0 Contents: -  - XjcRdeDomain: 2,667 entries -  - XjcRdeHeader: 1 entry -  - XjcRdeHost: 35,932 entries -  - XjcRdeRegistrar: 146 entries + - XjcRdeDomain: 2,667 entries + - XjcRdeHeader: 1 entry + - XjcRdeHost: 35,932 entries + - XjcRdeRegistrar: 146 entries RDE deposit is XML schema valid +``` -# 2. GhostRyDE it! - -$ nomulus -e production ghostryde --encrypt \ -    --input=${tld}_${date}_full_S1_R0-report.xml \ -    --output=${tld}_${date}_full_S1_R0-report.xml.ghostryde - -$ nomulus -e production ghostryde --encrypt \ -    --input=${tld}_${date}_full_S1_R0.xml \ -    --output=${tld}_${date}_full_S1_R0.xml.ghostryde - -# 3. Copy to Cloud Storage so RdeUploadTask can find them. +If you need the files to be picked up by the regular upload task, copy them +back to the root of the RDE bucket: -$ gcloud storage cp ${tld}_${date}_full_S1_R0{,-report}.xml.ghostryde gs://{PROJECT-ID}-rde/ +```shell +$ gcloud storage cp \ + ${tld}_${date}_full_S1_R0{,-report}.xml.ghostryde gs://{PROJECT-ID}-rde/ ``` ## Updating an RDE cursor @@ -262,7 +272,7 @@ $ nomulus -e production update_cursors --timestamp=2015-05-21T00:00:00Z --type=R These instructions work for Iron Mountain, and should be applicable to other escrow providers as well. 
We upload the RDE deposits to an sFTP server (see the -[ConfigModule](https://github.com/google/nomulus/blob/master/java/google/registry/config/ConfigModule.java) +[RegistryConfig][registry-config] for specific URLs). First, you need a generated deposit .xml file (see above for how to generate such a file locally, or how to decrypt a .ghostryde file into the .xml original). @@ -270,10 +280,11 @@ the .xml original). ### Encrypting the RDE deposits for sending to the escrow provider ```shell -$ nomulus -e production encrypt_escrow_deposit --tld=$tld --input=${tld}_${date}_full_S1_R0-report.xml +$ nomulus -e production encrypt_escrow_deposit --tld=$tld \ + --input=${tld}_${date}_full_S1_R0.xml $ ls *.ryde *.sig - ${tld}_${date}_full_S1_R0-report.ryde -  ${tld}_${date}_full_S1_R0-report.sig + ${tld}_${date}_full_S1_R0.ryde + ${tld}_${date}_full_S1_R0.sig ``` ### Verifying the deposit signature (optional) @@ -283,7 +294,7 @@ To verify the deposit signature, you will need a file containing the public key. ```shell $ (umask 0077; mkdir gpgtemp) $ GNUPGHOME=gpgtemp gpg --import ./rde-signing-public -$ GNUPGHOME=gpgtemp gpg --verify ${tld}_${date}_full_S1_R0-report.{sig,ryde} +$ GNUPGHOME=gpgtemp gpg --verify ${tld}_${date}_full_S1_R0.{sig,ryde} ... gpg: Good signature from ... ... 
@@ -328,3 +339,8 @@ notification report to ICANN: # This command assumes the report XML file is in your current working directory $ nomulus -e production send_escrow_report_to_icann xxx_2015-05-16_full_S1_R0-report.xml ``` + +[rde-staging-action]: https://github.com/google/nomulus/blob/master/core/src/main/java/google/registry/rde/RdeStagingAction.java +[rde-upload-action]: https://github.com/google/nomulus/blob/master/core/src/main/java/google/registry/rde/RdeUploadAction.java +[rde-report-action]: https://github.com/google/nomulus/blob/master/core/src/main/java/google/registry/rde/RdeReportAction.java +[registry-config]: https://github.com/google/nomulus/blob/master/core/src/main/java/google/registry/config/RegistryConfig.java diff --git a/docs/operational-procedures/reserved-list-management.md b/docs/operational-procedures/reserved-list-management.md index 5465278f78b..7bc45a663af 100644 --- a/docs/operational-procedures/reserved-list-management.md +++ b/docs/operational-procedures/reserved-list-management.md @@ -75,9 +75,11 @@ purposes of this example, we are creating a common reserved list named ```shell $ nomulus -e {ENVIRONMENT} create_reserved_list -i common_bad-words.txt -[ ... snip long confirmation prompt ... ] +reservedListMap=[(availableinga, ALLOWED_IN_SUNRISE), + (reserveddomain, FULLY_BLOCKED), ...] Perform this command? (y/N): y -Updated 1 entities. +Running ... +Saved reserved list common_bad-words with 2 entries. ``` Note that `-i` is the input file containing the list. You can optionally specify @@ -97,9 +99,11 @@ file containing the reserved list entries, then pass it as input to the ```shell $ nomulus -e {ENVIRONMENT} update_reserved_list -i common_bad-words.txt -[ ... snip diff of changes to list entries ... ] +Update reserved list for common_bad-words? +[diff of changes...] Perform this command? (y/N): y -Updated 1 entities. +Running ... +Saved reserved list common_bad-words with 2 entries. 
``` Note that, like the create command, the name of the list is inferred from the @@ -168,9 +172,11 @@ $ nomulus -e production check_domain {domain_name} ``` **Note that the list can be cached for up to 60 minutes, so changes may not take -place immediately**. If it is urgent that the new changes be applied, and it's -OK to potentially interrupt client connections, then you can use the GCP web -console to kill instances of the `frontend` service, as the cache is -per-instance. Once you've killed all the existing instances (don't kill them all -at once!), all the newly spun up instances will now be using the new values -you've configured. +place immediately**. If it is urgent that the new changes be applied, you can +perform a rolling restart of the `frontend` service deployment: + +```shell +$ kubectl rollout restart deployment frontend +``` + +This will cycle the pods and clear the per-instance caches without causing downtime. diff --git a/docs/operational-procedures/tld-security-restrictions.md b/docs/operational-procedures/tld-security-restrictions.md index a7458fb4d3a..d57d1be9ee5 100644 --- a/docs/operational-procedures/tld-security-restrictions.md +++ b/docs/operational-procedures/tld-security-restrictions.md @@ -31,5 +31,5 @@ allowedFullyQualifiedHostNames: When nameserver restrictions are set on a TLD, any domain mutation flow under that TLD will verify that the supplied nameservers are not empty and that they -are a strict subset of the allowed nameservers and registrants on the TLD. If no +are a strict subset of the allowed nameservers on the TLD. If no restrictions are set, domains can be created or updated without nameservers. diff --git a/docs/proxy-setup.md b/docs/proxy-setup.md index 97f96616b09..1a01980e62d 100644 --- a/docs/proxy-setup.md +++ b/docs/proxy-setup.md @@ -452,7 +452,7 @@ immediately after. 
```bash $ gcloud container clusters create proxy-americas-cluster --enable-autorepair \ --enable-autoupgrade --enable-autoscaling --max-nodes=3 --min-nodes=1 \ ---zone=us-east1-c --cluster-version=1.9.4-gke.1 --tags=proxy-cluster \ +--zone=us-east1-c --tags=proxy-cluster \ --service-account= ``` diff --git a/docs/rdap.md b/docs/rdap.md index 85a955ceb18..fb830efa8bf 100644 --- a/docs/rdap.md +++ b/docs/rdap.md @@ -43,8 +43,8 @@ containing the requested data. ## Nomulus RDAP request endpoints The suite of URL endpoint paths is listed below. The paths should be tacked onto -the usual App Engine server name. For example, if the App Engine project ID is -`project-id`, the full path for a domain lookup of domain iam.soy would be: +the pubapi workload host name. For example, if the base domain is +`mydomain.com`, the full path for a domain lookup of domain iam.soy would be: ``` https://pubapi.mydomain.com/rdap/domain/iam.soy diff --git a/docs/registrar-faq.md b/docs/registrar-faq.md index 4dd9feedd8e..85fc91b04ee 100644 --- a/docs/registrar-faq.md +++ b/docs/registrar-faq.md @@ -524,10 +524,10 @@ need to specify two items in requests, one for the normal price, one for the Early Access Fee. These should be specified as in the following example: -``` - +```xml + USD - 70 + 70.00 80.00 ```