Monday, February 26, 2018

Optimistic Concurrency Control (22)

Last post, I put in code to stop incorrect changes being made by malicious actors.

This post, I want to put in code to stop incorrect changes being made by innocent actors.

An example is the case that the form is submitted twice "at the same time", perhaps because it was opened on two different computers, submitted on the first, then later submitted on the second (or different browser tabs on the same computer). (I hit this bug with reddit quite often, and my bug submission there is one of the most disappointing bug responses I've ever encountered).

Because the was was opened the second time before the first instance was submitted, the user interface presents it as editable and waiting for submission, and continues to do so after the first instance has been submitted - for as long as you don't reload that page on the second computer.

A pretty common technique to deal with this is called optimistic concurrency control (OCC).

OCC assigns a version identifier to each version of the data. When a change is submitted, it contains the version against which the change is being made. If that version identifier isn't the same as the version in the database, the change is aborted.

If you're used to version control systems like git, you can imagine this as a very lightweight version of git's commit identifiers. Instead of storing the whole commit history, only the latest head is stored; and many of the interesting things git can do, like reconcile multiple changes against a history commit by merging can't happen in this system. (if you don't know about git, ignore this paragraph)

If you're used to transactions - this is a transaction system which can only enforce consistency against a single versioned record: you can't do SQL-style transactional changes to multiple records (unless they all have one single shared version identifier - which increases the likelihood of conflict).

It doesn't matter too much what database type is used for version identifiers, as long as you can generate new ones, and compare them.

I'm going to use a timestamp. This is quite a complicated type, but allows the field to double as a possibly-useful last-modified time.

First use the database migration system to update the registration table so that each record has a version. It doesn't matter what we use as an initial value for existing records; I'll use the current time.

In migrations/0006-occ.sql:

ALTER TABLE registration ADD COLUMN occ TIMESTAMP WITH TIME ZONE;
UPDATE registration SET occ = NOW();
ALTER TABLE registration ALTER COLUMN occ SET NOT NULL;

Without any further changes, we now can't create any new registrations although we can modify existing ones.

The Haskell-side Registration type now needs to know about this.

There's a postgresql-simple type for storing timestamps, which we can use in the definition of Registration.

import qualified Database.PostgreSQL.Simple.Time as PGT
...
data Registration = Registration {
...
    occ :: PGT.ZonedTimestamp,
...
}

There's immediately a build error:

    * No instance for (CSV.ToField
                         (postgresql-simple-0.5.3.0:Database.PostgreSQL.Simple.Time.Implementation.Unbounded
                            time-1.8.0.2:Data.Time.LocalTime.Internal.ZonedTime.ZonedTime))

... which is complaining that cassava doesn't know how to write a Registration out to CSV format any more - specifically because I doesn't know how to write out postgresql-simple timestamps.

Again I'm going to avoid wrapping this into its own type and specifying instances there. Instead I'm going to add a definition to our shifty Orphans.hs file:

{-# Language FlexibleInstances #-}
{-# Language TypeSynonymInstances #-}
...
import Database.PostgreSQL.Simple.Time as PGT
...
instance CSV.ToField PGT.ZonedTimestamp
  where
    toField time = CSV.toField (show time)

Those Language pragmas are needed because ZonedTimestamp is actually an alias for a more complicated type that can't by default have typeclass instances declared on it. This instance will write the timestamp to CSV as a string.

In Main.hs, the only place we create a Registration from scratch is in doInvitation. When constructing that registration we'll need an initial database timestamp. One lazy way to do that is go ask the database for one. (Another way would be to ask the Haskell runtime for the current time, and convert it into a ZonedTimestamp)

In src/DB.hs, the place for database library functions, add this:

import qualified Database.PostgreSQL.Simple.Time as PGT
...
generateOCC :: IO PGT.ZonedTimestamp
generateOCC = do
  [[n]] <- withDB $ \conn -> PG.query conn "SELECT NOW()" ()
  return n

... which will ask the database what time is NOW().

Now in doInvite we can add:

...
  newOCC <- generateOCC
...

     nonce = Just newNonce,
     occ = newOCC,
     email = Just (I.email invitation),
...

In registrationDigestiveForm we can add this OCC field initially as a constant value that cannot be modified:

registrationDigestiveForm initial = do
  Registration
...
    <*> (pure . occ) initial
...

What we should have now is a registration which generates an initial OCC version identifier when a registration is initially added, but doesn't ever change it or act on it. Nevertheless the code should compile and run at this point.

Now I want to actually use this new field to do some concurrency control.

Rather than this field being a constant loaded from the database at each iteration of form processing (so always having the latest value), instead I want to send it to the user's browser as part of the form so that when a form submission comes back, I know which version of the form data they were modifying.

digestive-functors provides DF.stringRead to define a form field that can be converted to/from string representation using the standard Show and Read typeclasses.

In registrationDigestiveForm, get rid of the pure . occ implementation of the OCC field, and instead use

...
    <*> "occ" .: DF.stringRead "OCC Version" (Just $ occ initial)
...

This new string field can be invisibly sent as part of the HTML form, to be returned as a POST parameter using inputHidden from digestive-functors-blaze.

Add this:

      inputHidden "occ" view

... anywhere in the HTML form definition in htmlForRegistration.

If you build and compile that, and look at the HTML page source of a registration form, you should see something like this:

<input type="hidden" id="Registration.occ" name="Registration.occ" value="2018-02-25 18:15:44.392519 +0000">

... which will be sent back on each submission.

... but this is still not doing any checking...

I'm going to check in two places: in the user interface where nice errors can be presented to the user; and nearer the database where I can be a bit surer that there are not race conditions.

I'll implement the database level checking first.

At present, updates happen with a call like this:

  withDB $ \conn -> gupdateInto conn "registration" "nonce = ?" n [identifier]

which returns the number of rows that have been updated, and then ignores that number.

I'm going to change the WHERE clause of that update: instead of only selecting rows with the appropriate nonce, I'm going to further narrow down to rows with the correct OCC version; and when writing out the new row update the OCC (in the same way that the status is updated); and then check that exactly one row was updated.

import Control.Monad (when)
...
    (_, Just newRegistration) -> do
      let oldOcc = occ newRegistration
      newOcc <- liftIO $ generateOCC
      let n = newRegistration { status = "C", occ = newOcc}
      rows <- withDB $ \conn -> gupdateInto conn "registration" "nonce = ? AND occ = ?" n (identifier, occ newRegistration)
      when (rows /= 1) $ error "Registration POST did not update exactly one row"
      return "Record updated."

That rows /= 1 is quite rough: there are other reasons why it might not work, such as the registration being duplicated, or not in the database at all any more. But it will do for now.

With this in place, open the same registration form in two browser tabs, and try to submit changes from first one, then the other. The first submission should work, and the second should fail.

So we get an ugly error message.

Another place we can check this is in form validation. This has a race condition, though, which is why the above database check is necessary: two submissions of the same form could both be validated, before the two submissions hit the database. That's quite unlikely though so I haven't put much effort into the error message above.

More likely that race condition won't happen and we will be fine with form-level validation.

In registrationDigestiveForm, put the field definition for "occ" in its own function:

    <*> "occ" .: occVersion (occ initial)

... which will have the same stringRead definition as before, but with a validator on top.

occVersion dbOCC =   
  DF.check  
    "This form has been modified by someone else - please reload"
    (\newOCC -> show newOCC == show dbOCC)
  $ DF.stringRead "OCC Version" (Just $ dbOCC)

That will check that the most recent version of the data loaded from the database at POST time matches the version sent in the POST request.

This will now cause the form to refuse to be submitted, telling the user that there were errors.

But it doesn't tell the user where that error occurred - we need to put DB.errorList "occ" view in htmlForRegistration at the point where we want OCC errors to be reported. With individual editable fields, it makes sense to put that error list for each field's input; but there isn't a relevant visible input field for occ. I've put the error list near the top of the HTML.

So now most of the time, forms will submit just as before. If there has been a simultaneous edit, then most of the time, the error from digestive-functors will be sent back; but (hopefully only very occasionally) there will be a database level error with a not-very-helpful error returned instead.

Eliminating that last case while still using digestive-functors can probably be done, but involves a bit more interplay between explicit SQL transactions and validation: perhaps, run the whole form validation and update inside an SQL transaction and if it aborts due to a conflict, run it again, and again, and again, hoping that the second time round form validation will fail.

What about if the user is malicious and fiddles their occ value? All they are able to do is update data that they could already have updated by loading the form fresh and fiddling in there. This is a protection against user mistakes, not against users being able to change data.

Here's the link to this post's commit.

Sunday, February 25, 2018

Immutable fields (21)

There are a couple of dodgy things in the present form submission that allow things to be modified that shouldn't be.

Firstly, some fields in Registration shouldn't be modifiable via the POST interface: for example, the nonce field to identify a particular registration, or the registration status (which will modified, but only by server side code).

At present, they aren't exposed to the end user in editable HTML fields - so it would be unusual for them to be edited accidentally. They do exist as hidden HTML fields, and even if they didn't, they could be added to a POST HTTP request by someone suitably malicious and skilled at reading the source code.

I'd like to prevent those fields from being modified.

A very simple-to-suggest option is that the form data types shouldn't have these fields at all - they should be nothing to do with the user-side wire protocol at all.

But, that fits in quite awkwardly with the idea that we can use a single Registration type both to represent records in the database and fields on the web form: a row either exists everywhere or not at all.

In each iteration of form processing, we have an initial value that we're using to populate all of the form fields. In the definition of registrationDigestiveForm, "status" .: nonEmptyString attaches the status field in Registration to an over-the-wire key/value pair also labelled "status". Instead we can use pure to place a constant value here, coming from that initial Registration. This removes any connection between this field in Registration and the wire protocol - nothing sent in a POST can change this value.

Here's the updated version of registrationDigestiveForm:

registrationDigestiveForm initial = do
  Registration
    <$> "firstname" .: nonEmptyString (Just $ firstname initial)
    <*> "lastname" .: nonEmptyString (Just $ lastname initial)
    <*> "dob" .: dateLikeString (Just $ dob initial)
    <*> "swim" .: DF.bool (Just $ swim initial)
    <*> (pure . nonce) initial
    <*> "email" .: DF.optionalString (email initial)
    <*> (pure . status) initial

Here's the commit for this post. Next, I want to deal with a different kind of bad change that more plausibly will happen even with innocent users: the case of the registration being modified in two places at the same time.

Saturday, February 24, 2018

Submission status (20)

A few posts ago I added a status field to the registration types, but so far the only value used there is N for new.

I'd like to introduce some more statuses: most importantly C for completed, set when a registrant submits their form.

That's a small change: in handleRegistrationPost, instead of saving the Registration provided by digestive functors, change the status first:

    (_, Just newRegistration) -> do
      let n = newRegistration { status = "C" }
      withDB $ \conn -> gupdateInto conn "registration" "nonce = ?" n [identifier]
      return "Record updated."

Now when a registrant submits a valid form, it will be marked as completed in the database. You'll be able to see that if you look at the registration table in the database, but this changed status won't otherwise affect the behaviour of the registration system.

I'd like the registration system to prohibit changes once a form has been completed: in real life, perhaps because that information is going to be used to print admission wristbands or attendee lists. That's not to say that a registration can't ever be amended after this: but there is more happening in real life that needs to be adjusted too, and this little registration system isn't going to do that.

People coming to edit the form do so via handleRegistration in app/Main.hs and it's there that I'll put in a check for completion status.

First I'll define a predicate on status to decide if a form is editable: by default, not editable; and explicitly, editable if New and not-editable if Completed.

editableStatus :: String -> Bool
editableStatus "N" = True
editableStatus "C" = False
editableStatus _ = False

handleRegistration can split into two:

handleRegistration :: String -> S.Handler B.Html
handleRegistration identifier = do
  registration <- selectByNonce identifier

  if (editableStatus . status) registration
  then handleEditableRegistration registration
  else handleReadOnlyRegistration registration

... with handleEditableRegistration containing the remainder of the original handleRegistration, with a tweak to extract the nonce: where we previously had identifier in scope we do not, so instead extract it with nonce registration.

handleEditableRegistration :: Registration -> S.Handler B.Html
handleEditableRegistration registration = do
...
      B.h1 $ do "Registration "
                B.toHtml (show $ nonce registration)
...

... and handleReadOnlyRegistration should deliver some non-editable HTML description, for example:

handleReadOnlyRegistration :: Registration -> S.Handler B.Html
handleReadOnlyRegistration registration =
  return $ B.docTypeHtml $ do
    B.body $ do
      B.h1 $ do "Completed registration "
                B.toHtml (show $ nonce registration)
      B.p $ "First name: " <> (fromString . firstname) registration
      B.p $ "Last name: " <> (fromString . lastname) registration
      B.p $ "Date of birth: " <> (fromString . dob) registration

If there are next steps to be taken by a user, this is an appropriate place to include links (for example, in the real life version, at this point there is a downloadable PDF to print and sign - that we only want available once the form has been completed).

There's a security problem here though: although the server no longer delivers an editable form to the end user, so that they aren't encouraged to edit the form, nothing else stops a POST request being sent some other way to cause changes to the database. With friendly users, that's most likely to happen (I think) with the form being opened twice, and then submitted twice. With malicious users, it's probably not too hard to fake a submission with curl (for example) as the protocol is really simple.

So next I'm going to look at verifying updates a bit more strictly.

Friday, February 23, 2018

factor withDB out (19)

The code so far has ended up with a six or so repeated blocks that look like:

liftIO $ bracket
  (PG.connectPostgreSQL "user='postgres'")
  PG.close
  \conn -> _something_to_do_with_the_database_

More than three times violates the Rule of Three so I'm going to pull this out into a new file, src/DB.hs.

{-# Language OverloadedStrings #-}

module DB where

import Control.Exception (bracket)
import Control.Monad.IO.Class (liftIO, MonadIO)
import qualified Database.PostgreSQL.Simple as PG

withDB :: MonadIO io => (PG.Connection -> IO b) -> io b
withDB act = liftIO $ bracket
  (PG.connectPostgreSQL "user='postgres'")
  PG.close
  act

Now every occurence of bracket happens to fit into the above pattern, so every place in the code you see this:

  [registration] <- liftIO $ bracket
    (PG.connectPostgreSQL "user='postgres'")
    PG.close
    $ \conn -> do
       PGS.gselectFrom conn "registration where nonce = ?" [identifier]

... replace it with this:

  [registration] <- withDB $ \conn -> do
       PGS.gselectFrom conn "registration where nonce = ?" [identifier]

... which is three lines shorter.

We can go a bit further - this occurs three times:

withDB $ \conn -> PGS.gselectFrom conn "registration where nonce = ?" [identifier]

so replace that with a new selectByNonce:

import qualified Database.PostgreSQL.Simple.SOP as PGS
selectByNonce :: MonadIO io => String -> io [Registration]
selectByNonce identifier = withDB $ \conn -> PGS.gselectFrom conn "registration where nonce = ?" [identifier]

... or go one step further and realise that a selectByNonce is always used where exactly one row is expected to match, like this:

  [registration] <- selectByNonce identifier

and so that pattern matching can move into selectByNonce too, with better error reporting. Now the function will return exactly one Registration or otherwise throw an error (rather than returning a list and the same error manifesting as a pattern match failure).

selectByNonce :: MonadIO io => String -> io Registration
selectByNonce identifier = do
  res <- withDB $ \conn -> PGS.gselectFrom conn "registration where nonce = ?" [identifier]
  case res of
    [r] -> return r
    [] -> error $ "selectByNonce: no rows returned for " ++ identifier
    _ -> error $ "selectByNonce: multiple rows returned for " ++ identifier

... with invocations looking like this:

registration <- selectByNonce identifier

Those are just a couple of small changes but they make database access look much tidier. Specifically if the code moved to some more elaborate database connection handling (for example, a persistent connection shared between requests), then withDB could be rewritten to deal with that but the higher level code would not need to change.

Here's the commit for this post.

Thursday, February 22, 2018

Sending invitations by email (18)

Now I want to invite people by email: when a person is added to the database, they should be sent an email with a link to complete their registration details (some of which - name - will have been filled in as part of entering the invitation information).

I'd like a function quite like sendTestEmail from the previous post, but to which I can pass a registration identifier. Once that works, it can be called from the end of the invitation form submission code, and potentially from other places (for example, in the real life version of this crap web form, there is an administrator link to re-send an invitation).

The bit that actually does the delivery, sendEmail, won't change. But we'll need to generate a Mail based on the selected Registration rather than using a fairly constant value.

First, here are a load of new imports to add to src/InvitationEmail.hs: we're going to be accessing the database and generating HTML emails with blaze, plus bits to stick them together. blaze-html also needs to be added in the library dependencies in package.yaml - it's only in the main executable dependencies at the moment.

+import Control.Exception (bracket)
import Control.Monad.IO.Class (liftIO)
import Data.Maybe (fromMaybe)
import Data.Monoid ( (<>) )
import Data.String (fromString)
import qualified Data.Text.Lazy as TL
import qualified Database.PostgreSQL.Simple as PG
import qualified Database.PostgreSQL.Simple.SOP as PGS
import qualified Text.Blaze.Html5 as B
import Text.Blaze.Html5 ( (!) )
import qualified Text.Blaze.Html5.Attributes as BA
import qualified Text.Blaze.Html.Renderer.Text as BT

import Registration

First let's get the configuration, and the relevant registration record:

sendInvitationEmail identifier = do
  c <- getConfig

  [r] <- liftIO $ bracket
    (PG.connectPostgreSQL "user='postgres'")
    PG.close
    $ \conn -> PGS.gselectFrom conn "registration where nonce = ?" [identifier]

... and now prepare some of the components we'll need to generate a Mail:

  let ub = urlbase c
  let url = ub <> "/registration/" <> identifier

  let fullname = T.pack $ firstname r ++ " " ++ lastname r
  let targetEmail = T.pack $
        fromMaybe
          (error $ "No email supplied for " ++ identifier)
          (email r)
  let subject = "Registration form for " <> fullname

I'd like to produce two forms of body text: a plain text version, and an HTML version with a clickable link. The plaintext version comes from simple string concatenation; the HTML version comes from rendering some blaze-html to a text value:

  let plaintext =
          "Hello.\n"
       <> "Please complete this registration form.\n"
       <> TL.pack url

  let htmltext = BT.renderHtml $ do
        B.p "Hello"
        B.p "Please complete this registration form.\n"
        B.p $ (B.a ! BA.href (fromString url)) (fromString url)

It would be nice if this text wasn't written out twice, and if there was more text in here I'd probably make more effort to abstract away the content (with yet another markup language or perhaps just a single paragraph read from elsewhere).

Now we can prepare a Mail to send - similar to the last post's Mail construction, but plugging in the appropriate values.

  let mail = M.Mail {
      M.mailFrom = M.Address { M.addressName = Just "Registration System"
                             , M.addressEmail = T.pack $ smtpFrom c
                             }
    , M.mailTo = [M.Address { M.addressName = Just fullname
                            , M.addressEmail = targetEmail
                            }
             ]
    , M.mailCc = []
    , M.mailBcc = []
    , M.mailHeaders = [("Subject", subject)]
    , M.mailParts = [[M.plainPart plaintext, M.htmlPart htmltext]]
    }

and finally we can send it:

  sendEmail mail

With that done, find doInvitation in app/Main.hs and right before the final return, add:

  sendInvitationEmail newNonce

Now try inviting yourself. Or your friends.

I'm frustrated in this post that there are so many kinds of string-like types used: General string types include String, Text, Text.Lazy, ByteString.Lazy; and there's a specialised AttributeValue used as the parameter of !. ++ works to concatenate String, and <> works to concatenate all of them, so I've used the latter for all of the types.

url is the only place where two different types are needed from the same value: so I form it as a String and use the polymorphic fromString to turn it into whatever is needed at the point of use.

Here's the commit for this post.

Next, I've noticed that there are quite a few places where the same bracket and connectPostgreSQL code is being used to open the database; so I'll pull that out so it is only written once; and maybe also tidy up a bit of database related error handling.

Tuesday, February 20, 2018

Sending an email (17)

There doesn't seem to be a clear answer about the best way to send email in Haskell in the way that there are (several) web frameworks around. But a bit of digging in Stack Overflow suggests HaskellNet-SSL and mime-mail can be put to good use.

For this post, you're going to need an outbound mail server, which I'm expecting to need to authenticate to in order to send emails. I have my own outbound server, but I think you can probably use gmail. The details for this will be specified using the configuration mechanism implemented in the last past.

In the config file config.yaml, add your own version of this with your own email address, and mail server details.

smtpFrom: "reg@cwfh28.example.com"
smtpServer: "smtp.example.com"
smtpPort: "587"
smtpUser: "your_user_id"
smtpPassword: "yourpass"

... and add the corresponding fields in src/Config.hs:

import qualified Network.Socket as N

data Config = Config {

...

  , smtpServer :: String
  , smtpPort :: N.PortNumber
  , smtpUser :: String
  , smtpPassword :: String

...

}

You'll need to add in some dependencies to package.yaml:

    - HaskellNet
    - HaskellNet-SSL
    - bytestring
    - mime-mail
    - network
    - text

... and if you try to compile now, you'll find that there is no FromJSON instance for port number, so smtpPort can't be deserialised from YAML.

Luckily we have src/Orphans.hs as a dumping ground for orphan instances, and so we can add (with suitable imports):

instance Y.FromJSON N.PortNumber
  where parseJSON v = read <$> Y.parseJSON v

... which combines the PortNumber Read instance with the String FromJSON instance. That's why the example smtpPort configuration snippet above uses double-quotes around 587.

With this configuration in place, we can now write some code to send an email. To begin with, it will just be a test sender, rather than anything directly related to registrations.

Add an API endpoint to trigger the sending of this test mail:

type MailTestAPI = "admin" :> "mailtest" :> S.Get '[SB.HTML] B.Html

type API = 
...
      :<|> MailTestAPI


server = ... :<|> handleMailTest

handleMailTest = 

handleMailTest :: Handler B.Html
handleMailTest = do
  liftIO $ sendTestMail
  return $ B.p "mail test sent"

All that will do is give us a new API endpoint - so loading http://localhost:8080/admin/mailtest will invoke sendMailTest, which isn't written yet.

So create a new module in src/InvitationEmail.hs and import it into app/Main.hs.

This module will need these prereqs:

{-# Language OverloadedStrings #-}
module InvitationEmail where

import qualified Network.Mail.Mime as M

import Config

sendTop then consists of two parts: first, constructing a Mail containing the email to send:

sendTestMail = do
  c <- getConfig
  sendEmail $
    M.Mail {
      M.mailFrom = M.Address { M.addressName = Just "Registration System"
                             , M.addressEmail = T.pack $ smtpFrom c
                             }
    , M.mailTo = [M.Address { M.addressName = Just "You"
                            , M.addressEmail = T.pack $ smtpFrom c
                            }
             ]
    , M.mailCc = []
    , M.mailBcc = []
    , M.mailHeaders = [("Subject", "Test registration invitation for you")]
    , M.mailParts = [[M.plainPart "HELLO"]]
    }

... and the second part, sendEmail, which knows how to send the supplied email:

import qualified Data.ByteString.Lazy as BSL
import qualified Data.Text as T
import qualified Network.HaskellNet.SMTP.SSL as SSL
import qualified Network.Mail.Mime as M

...
sendEmail :: M.Mail -> IO ()
sendEmail msg = do
  config <- getConfig

  rendered <- BSL.toStrict <$> M.renderMail' msg

  let sslSettings = SSL.defaultSettingsSMTPSTARTTLS
       { SSL.sslPort = smtpPort config,
         SSL.sslDisableCertificateValidation = True
       } 

  SSL.doSMTPSTARTTLSWithSettings
    (smtpServer config)
    sslSettings
    $ \connection -> do
      succeeded  <- SSL.authenticate SSL.LOGIN
                                     (smtpUser config)
                                     (smtpPassword config)
                                     connection
      if succeeded
      then
          SSL.sendMail (addressToText (M.mailFrom msg))
                       (map addressToText (M.mailTo msg))
                       rendered connection
      else error "Could not authenticate to SMTP server"

This basically builds a server config, renders the message into a strict bytestring, then runs an SMTP session which first authenticates using your specified username/password and then sends the message (to/from the same addresses as used in the mail body - in internet email, there are at least two different To/From addresses although it's common for them to match up).

Note that SSL turns off all certificate validation, because certificate validation is a ballache to configure, and always has been. Except for when it's hard-coded into your browser. That option is an annoying double negative Disable... = True which always makes me frown.

This code is also reading the configuration file in twice. That's inefficiently but not terribly so for the expected load, and it saves threading the config through the code either explicitly or in a ReaderT.

Anyway, now you should be able to start up the web server, load http://localhost:8080/admin/mailtest, wait a few seconds for stuff to happen, and get an invitation email in your inbox.

If you're trying to use gmail outbound servers, you probably need to Allow Less Secure Apps (such as the crap webform). I had to do that, and also had to use the base of my gmail address (without @gmail.com) as the username.

Next post, I'll make that email be sent as part of creating an invitation, and customise it to have the appropriate link to click on.

Sunday, February 18, 2018

A configuration file (16)

We're going to need some configuration values for sending email, so I'll implement a very basic configuration file now.

To begin with, all it will contain is a single parameter to replace that hardcoded localhost:8080 in the source code. (trying to figure out something like that automatically is fraught with naivety and frustration)

I'll use the yaml yaml library to read in a config file formatted in yaml. For the simple key/value pair style of config file that I'm expecting this is about right. So add yaml to the library dependencies in package.yaml.

I'm going to base the configuration file around a Haskell data structure. Put this in a new file src/Config.hs:

module Config where

data Config = Config {
  urlbase :: String
}

Later on, any new configuration keys can be added into this.

I'd like to then have a function:

getConfig :: IO Config

... that returns the config.

The yaml module provides a function that reads YAML into an almost arbitrary data structure, returning either that structure or an error.

import qualified Data.Yaml as Y
...
getConfig :: IO Config
getConfig = do
  e <- Y.decodeFileEither "config.yaml"
  either
    (\err -> error $ "cannot read config file: " ++ show err)
    return
    e

decodeFileEither returns either an error (on the Left), or the loaded Config (on the Right). It knows that the output is a Config by inference from the type signature of getConfig - so in this case, the type signature isn't just pretty decoration.

If the config file can't be read, we'll error out and let someone else handle the mess. If this happens inside a servant request handler for example, it'll return a 500 error code to the remote client.

This won't compile though. decodeFileEither can't decode into any arbitrary data structure. It actually needs the destination class to be an instance of ToJSON (because YAML is a relative of JSON, and the yaml module uses that fact). Luckily it's possible to use generics for this again:

import qualified GHC.Generics as G
...
data Config = Config {
  urlbase :: String
} deriving (G.Generic, Y.FromJSON)

Now we can use this. In app/Main.hs, find doInvitation and add:

config <- getConfig

... somewhere in there. And now the code for generating the URL can change from:

let url = "http://localhost:8080/registration/" ++ newNonce

... to ...

let url = (urlbase config) ++ "/registration/" ++ newNonce

Here's an example configuration file that uses localhost still:

urlbase: "http://localhost:8080"

You can change that localhost to, for example, your PC's LAN IP address and then perhaps be able to register for events from your phone on the same LAN.

Used in this way, the config file is re-read every time doInvitation runs. For the light load I'm expecting I'm not massively fussed about this. A more Haskelly approach would be to read the configuration once at startup and thread it through the program where needed using ReaderT. That would also mean the whole program would see a consistent configuration even if the configuration file is changed; which may be a good thing or a bad thing.

Here's today's commit: f1bbe3a2. There's a config.yaml.example in there, that you'll have to copy into place. The git repo is also configured to ignore the live version of the configuration file so that it won't be committed to version control, in .gitignore. This is because that configuration is going to contain secrets in the future, and version control is a hilarious way to leak your secrets.

Next I'm going to get some basic email sending working.