TEDS Data Dictionary

Derived Variables in the 21 Year Dataset

This page gives a listing of derived variables in the 21 Year dataset, in alphabetical order of variable name. For each variable, a short written description is followed by the SPSS syntax (in a box) that was used to derive the variable.

This page does not include descriptions of background variables that are derived from other sources and that are included in the 21 Year dataset. For information about such variables, see pages describing background variables, exclusions and scrambled IDs.

Most of the twin-specific variables were derived prior to double entering the dataset. Hence the variable names used in the syntax often lack the endings (1 or 2) used for the final double entered variables.

Variable name prefixes indicate the studies from which they were derived:

u1p: TEDS21 phase 1 parent questionnaire
u1c: TEDS21 phase 1 twin questionnaire
u2c: TEDS21 phase 2 twin questionnaire
ucg: G-game twin tests
ucv1: Covid phase 1 twin questionnaire
ucv2: Covid phase 2 twin questionnaire
ucv3: Covid phase 3 twin questionnaire
ucv4: Covid phase 4 twin questionnaire

Definitions of derived variables

Listed alphabetically

u1cactvm1/2, ucv1actvm1/2, ucv2actvm1/2, ucv3actvm1/2, ucv4actvm1/2

Physical Activity scale, derived from all 3 items of the measure in the TEDS21 phase 1 twin questionnaire (u1c) and the covid phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) twin questionnaires.
This scale is derived as a weighted mean, with weightings 3 for item 1 (strenuous activity), 2 for item 2 (moderate activity) and 1 for item 3 (mild activity). Each item has integer response values 1-5, and the weighted sum is divided by the sum of the weightings (6) so that the scale also has values from 1 to 5. All three items are required to be non-missing for this scale to be computed, because any missing item would complicate the calculation of weightings.

* Compute as a weighted mean, with weightings 3, 2 and 1 respectively.
* for items 1 (strenuous), 2 (moderate) and 3 (mild).
* To keep things simple, require all three items to be non-missing.
COMPUTE u1cactvm = (SUM.3((3 * u1cactv1), (2 * u1cactv2), u1cactv3)) / 6.
COMPUTE ucv1actvm = (SUM.3((3 * ucv1actv1), (2 * ucv1actv2), ucv1actv3)) / 6.
COMPUTE ucv2actvm = (SUM.3((3 * ucv2actv1), (2 * ucv2actv2), ucv2actv3)) / 6.
COMPUTE ucv3actvm = (SUM.3((3 * ucv3actv1), (2 * ucv3actv2), ucv3actv3)) / 6.
COMPUTE ucv4actvm = (SUM.3((3 * ucv4actv1), (2 * ucv4actv2), ucv4actv3)) / 6.
EXECUTE.
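
For readers who do not use SPSS, the weighted mean can be sketched in Python as below. This is an illustration only, not the syntax used to derive the variable; the function name activity_scale is hypothetical, and missing values are represented by None. As with SUM.3, the result is missing unless all three items are present.

```python
def activity_scale(strenuous, moderate, mild):
    """Weighted mean of three 1-5 items, with weightings 3, 2 and 1."""
    items = (strenuous, moderate, mild)
    # Mirror SUM.3: all three items must be non-missing.
    if any(value is None for value in items):
        return None
    # Dividing by the sum of the weightings (6) keeps the result in the 1-5 range.
    return (3 * strenuous + 2 * moderate + 1 * mild) / 6


print(activity_scale(5, 5, 5))     # highest response on every item
print(activity_scale(1, 1, 1))     # lowest response on every item
print(activity_scale(4, None, 2))  # one missing item gives a missing scale
```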

u1cage1/2, u2cage1/2, u1page

Age of twin (in decimal years) when various TEDS21 data components were completed (online) or returned (on paper):
u1page: phase 1 parent questionnaire;
u1cage1/2: phase 1 twin questionnaire;
u2cage1/2: phase 2 twin questionnaire.
Derived from item variables representing relevant dates (start dates for electronic data, or logged return dates for paper questionnaires). Variable aonsdob is the twin birth date. These date variables are not retained in the dataset.

* For TEDS21, we need the best estimate of date according to return method.
* For web/app users use the start dates.
IF (ANY(u1psource, 1, 2, 3)) u1pdate = u1pstart.
IF (ANY(u1csource, 1, 2, 3)) u1cdate = u1cstart.
IF (ANY(u2csource, 1, 2, 3)) u2cdate = u2cstart.
EXECUTE.
* For paper users use the return date.
IF (u1psource = 4) u1pdate = u1prdate.
IF (u1csource = 4) u1cdate = u1crdate1.
IF (u2csource = 4) u2cdate = u2crdate1.
EXECUTE.

* Now derive the ages.
COMPUTE u1page = RND((DATEDIFF(u1pdate, aonsdob, "days")) / 365.25, 0.1) .
COMPUTE u1cage = RND((DATEDIFF(u1cdate, aonsdob, "days")) / 365.25, 0.1) .
COMPUTE u2cage = RND((DATEDIFF(u2cdate, aonsdob, "days")) / 365.25, 0.1) .
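
The age calculation can be illustrated with a short Python sketch (not the TEDS syntax; the function name decimal_age is hypothetical). It assumes both dates are plain calendar dates. Python's round() may treat exact ties differently from the SPSS RND function, which should rarely matter at this precision.

```python
from datetime import date


def decimal_age(birth_date, activity_date):
    """Age in years to one decimal place, from the difference in whole days."""
    days = (activity_date - birth_date).days
    # 365.25 is the average length of a year in days, allowing for leap years.
    return round(days / 365.25, 1)


print(decimal_age(date(1995, 1, 1), date(2016, 1, 1)))
print(decimal_age(date(1995, 1, 1), date(2016, 7, 1)))
```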

u1cbaqphym1/2, u1cbaqverm1/2, u1cbaqangm1/2, u1cbaqm1/2

BAQ Aggression subscales, and an overall scale, derived from items of the measure in the phase 1 twin questionnaire. The subscales are for physical aggression (u1cbaqphym), verbal aggression (u1cbaqverm) and anger (u1cbaqangm) while the overall scale is u1cbaqm.
Each subscale is a mean of either two or three of the items, while the overall scale is a mean of all 8 items, in each case requiring at least half of the items to be non-missing. Each item has response values 1-5, hence each of these scales has the same range as it is computed as a mean.

* Physical aggression: items 1-3.
COMPUTE u1cbaqphym = MEAN.2(u1cbaq1, u1cbaq2, u1cbaq3).
* Verbal aggression: items 4-6.
COMPUTE u1cbaqverm = MEAN.2(u1cbaq4, u1cbaq5, u1cbaq6).
* Anger: items 7-8.
COMPUTE u1cbaqangm = MEAN.1(u1cbaq7, u1cbaq8).
* Total scale: all 8 items.
COMPUTE u1cbaqm = MEAN.4(u1cbaq1, u1cbaq2, u1cbaq3,
 u1cbaq4, u1cbaq5, u1cbaq6, u1cbaq7, u1cbaq8).
EXECUTE.
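
Many scales on this page use the SPSS MEAN.n function, which returns the mean of the non-missing arguments provided that at least n of them are non-missing. A minimal Python sketch of that behaviour is given below for illustration; the function name mean_n is hypothetical and missing values are represented by None.

```python
def mean_n(values, min_valid):
    """Mean of the non-missing values, or None if fewer than min_valid are present."""
    valid = [value for value in values if value is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


# Anger subscale (2 items, at least 1 required): one missing item is tolerated.
print(mean_n([4, None], 1))
# Physical aggression subscale (3 items, at least 2 required): too few items here.
print(mean_n([1, None, None], 2))
```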

u1cbmi1/2

Twin BMI, measured in kilograms per square metre. Derived from twin heights and weights (item variables).

* Height variable is in centimetres.
* We want BMI in units of kilograms per square metre.
* So include a scaling factor of 10000 in the BMI calculation.
COMPUTE u1cbmi = 10000 * u1cwtkg / (u1chtcm * u1chtcm).
EXECUTE.
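
The scaling factor follows from converting height to metres: BMI = weight / (height_cm / 100)^2 = 10000 * weight / height_cm^2. A small Python illustration is given below (the function name bmi is hypothetical, and missing values are represented by None).

```python
def bmi(weight_kg, height_cm):
    """BMI in kilograms per square metre, from weight in kg and height in cm."""
    if weight_kg is None or height_cm is None:
        return None
    # The factor 10000 (= 100 squared) converts square centimetres to square metres.
    return 10000 * weight_kg / (height_cm * height_cm)


print(round(bmi(70, 175), 1))
```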

u1cbsaem1/2, u1cbsagm1/2

BSA Environment (u1cbsaem) and Government (u1cbsagm) scales, derived from items of the BSA measure(s) in the phase 1 twin questionnaire.
The scales are means of 6 and 5 items respectively, which are all the items included in the questionnaire. Some of the items are reversed for the Environment scale. Each item has response values 1-5, hence each of these scales has the same range as it is computed as a mean.

* Attitudes to environment: mean from all 6 items (some reversed).
COMPUTE u1cbsaem = MEAN.3(u1cbsae1, u1cbsae2r, u1cbsae3r, u1cbsae4r, u1cbsae5r, u1cbsae6r).
* Attitudes to government: mean from all 5 items.
COMPUTE u1cbsagm = MEAN.3(u1cbsag1, u1cbsag2, u1cbsag3, u1cbsag4, u1cbsag5).
EXECUTE.

u1cchaost1/2

Chaos total scale, from the twin phase 1 questionnaire, derived from all 6 available items of the Chaos measure (reversed where necessary).
Each item has integer response values 0-2, hence each scale has a range of values from 0 to (2 * number of items) because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cchaost = 6 * MEAN.3(u1cchaos1r, u1cchaos2, u1cchaos3, u1cchaos4r, u1cchaos5, u1cchaos6r).
EXECUTE.
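
Multiplying the mean by the number of items gives a 'total' in which any missing items are effectively replaced by the mean of the items that were answered. The Python sketch below illustrates this for a 6-item scale with responses 0-2; it is not the TEDS syntax, the function name prorated_total is hypothetical, and missing values are represented by None.

```python
def prorated_total(values, min_valid):
    """Number of items multiplied by the mean of the non-missing items."""
    valid = [value for value in values if value is not None]
    if len(valid) < min_valid:
        return None
    return len(values) * sum(valid) / len(valid)


# Three of six items answered, all with the top response of 2: total is 12.
print(prorated_total([2, 2, 2, None, None, None], 3))
# Only two of six items answered: fewer than half, so the total is missing.
print(prorated_total([2, 2, None, None, None, None], 3))
```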

u1ccommm1/2, ucv1commm1/2, ucv2commm1/2, ucv3commm1/2

Community scale, derived from all 5 items (reversed where necessary) of the measure in the TEDS21 phase 1 twin questionnaire (u1c) and in the covid twin phase 1 (ucv1), phase 2 (ucv2) and phase 3 (ucv3) questionnaires. Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1ccommm = MEAN.3(u1ccomm1, u1ccomm2r, u1ccomm3, u1ccomm4r, u1ccomm5).
COMPUTE ucv1commm = MEAN.3(ucv1comm1, ucv1comm2r, ucv1comm3, ucv1comm4r, ucv1comm5).
COMPUTE ucv2commm = MEAN.3(ucv2comm1, ucv2comm2r, ucv2comm3, ucv2comm4r, ucv2comm5).
COMPUTE ucv3commm = MEAN.3(ucv3comm1, ucv3comm2r, ucv3comm3, ucv3comm4r, ucv3comm5).
EXECUTE.

u1cdadrm1/2, u1cmumrm1/2, u1ctwnrm1/2

Scales for relationships with twin (u1ctwnrm), mother (u1cmumrm) and father (u1cdadrm), derived from items for these three closely-related measures in the phase 1 twin questionnaire. Each scale is a mean of all 5 items in the respective measure, requiring at least half the items to be non-missing. In the mother and father scales, item 5 is reversed in the syntax. Every item has response values 1-5, hence each scale has the same range as it is computed as a mean.

* Twin relationships.
COMPUTE u1ctwnrm = MEAN.3(u1ctwnr1, u1ctwnr2, u1ctwnr3, u1ctwnr4, u1ctwnr5).
* Mother relationships: reverse the fifth item.
COMPUTE u1cmumrm = MEAN.3(u1cmumr1, u1cmumr2, u1cmumr3, u1cmumr4, (6 - u1cmumr5)).
* Father relationships: reverse the fifth item.
COMPUTE u1cdadrm = MEAN.3(u1cdadr1, u1cdadr2, u1cdadr3, u1cdadr4, (6 - u1cdadr5)).
EXECUTE.
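
The expression (6 - item) reverses an item with response values 1-5, so that 1 becomes 5, 2 becomes 4, and so on. More generally, an item can be reversed by subtracting it from the sum of its lowest and highest response values. A short Python illustration follows; the function name reverse_item and its parameters are hypothetical, and missing values are represented by None.

```python
def reverse_item(value, low=1, high=5):
    """Reverse-code a response on a scale running from low to high."""
    if value is None:
        return None
    return low + high - value


# Responses 1-5 reversed, equivalent to (6 - item) in the syntax above.
print([reverse_item(value) for value in (1, 2, 3, 4, 5)])
```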

u1cdevmob1/2, u1cdevwdth1/2

Device categories used for the TEDS21 twin phase 1 questionnaire, if completed electronically.
u1cdevmob flags mobile devices (mobile phones and tablets), coded 1=yes 0=no.
u1cdevwdth is a device width category, coded 1=small, 2=medium, 3=large, where width refers to the widest side of the device's screen.
Derived from raw item variables, collected on the web/app server, that are not retained in the dataset. Mobile devices could be categorised from both app and web data, using different methods. Screen sizes were only recorded in the web data, not the app data.

* Mobile devices: CMS data.
* ------------------------.
* The raw CMS variable PlatformType has values 'Web' or 'Mobile'.
* so we can assume 'Mobile' value refers to mobile devices.
* while 'Web' in most cases probably means a web browser used on a laptop/desktop.
RECODE PlatformType ('Web'=0) ('Mobile'=1)
INTO u1cdevmob.

* Mobile devices: web backup.
* --------------------------.
* Use substrings of the consenttechuseragent string to categorise broad device types.
* 'Windows', 'Macintosh' and 'X11' are probably all laptops or desktops.
IF (CHAR.INDEX(consenttechuseragent, 'Windows') > 0) devicetype = 1.
IF (CHAR.INDEX(consenttechuseragent, 'Macintosh') > 0) devicetype = 2.
IF (CHAR.INDEX(consenttechuseragent, 'X11') > 0) devicetype = 3.
EXECUTE.
* 'iPhone' and 'Android' are likely to be mobile phones.
* (in rare cases of Windows phones with Android installed, this supersedes 'Windows' above).
IF (CHAR.INDEX(consenttechuseragent, 'iPhone') > 0) devicetype = 4.
IF (CHAR.INDEX(consenttechuseragent, 'Android') > 0) devicetype = 5.
EXECUTE.
* 'iPad' and 'Tablet' are presumably tablets (the latter can overrule 'Windows').
IF (CHAR.INDEX(consenttechuseragent, 'iPad') > 0) devicetype = 6.
IF (CHAR.INDEX(consenttechuseragent, 'Tablet') > 0) devicetype = 7.
EXECUTE.
* 'Mobile' is another common substring, but it always indicates mobile phones.
* that are already categorised by the other substrings above.

* Recode into a binary variable to flag mobile devices.
* to be retained in the dataset (raw device variables will be dropped).
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO u1cdevmob.
EXECUTE.

* Screen width: web backup only.
* ----------------------------.
* Some devices may be used in portrait or landscape mode.
* so for consistency treat the largest dimension as the screen 'width'.
IF (consenttechscrwidth >= consenttechscrheight) screenwidth = consenttechscrwidth.
IF (consenttechscrwidth < consenttechscrheight) screenwidth = consenttechscrheight.
EXECUTE.
* Categorise screen widths arbitrarily as small, medium or large.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO u1cdevwdth.
EXECUTE.
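
The web-based categorisation can be sketched in Python as below, for illustration only. The function names are hypothetical and the user agent strings in the examples are made up. As in the syntax above, later substring matches overwrite earlier ones, and matching is case-sensitive. Screen dimensions are assumed to be whole numbers of pixels.

```python
# Substrings checked in the same order as the syntax above.
DEVICE_SUBSTRINGS = [
    (1, "Windows"), (2, "Macintosh"), (3, "X11"),
    (4, "iPhone"), (5, "Android"), (6, "iPad"), (7, "Tablet"),
]


def device_type(user_agent):
    """Return the device type code (1-7), or None if no substring matches."""
    code = None
    for value, substring in DEVICE_SUBSTRINGS:
        if substring in user_agent:
            code = value  # a later match overrules an earlier one
    return code


def is_mobile(user_agent):
    """Flag mobile devices: codes 1-3 give 0, codes 4-7 give 1."""
    code = device_type(user_agent)
    if code is None:
        return None
    return 1 if code >= 4 else 0


def width_category(screen_width, screen_height):
    """Categorise the longest screen dimension as 1=small, 2=medium, 3=large."""
    longest = max(screen_width, screen_height)
    if longest <= 767:
        return 1
    if longest <= 1199:
        return 2
    return 3


print(is_mobile("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
print(is_mobile("Mozilla/5.0 (Linux; Android 10; Mobile)"))
print(width_category(375, 812))
```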

u1ceatsbint1/2, u1ceatsbodt1/2

Eating disorder symptoms subscales.
Derived from 11 of the 12 items in the Eating Disorder Symptoms measure in the twin phase 1 questionnaire (item 11 is not included in either scale). The subscales are for binge-eating-related symptoms (u1ceatsbint) and for preoccupation with body image (u1ceatsbodt), based on 3 and 8 items respectively.
Each item has integer response values 0-5, hence each subscale has a range of values from 0 to (5 * number of component items) as it is a 'total' computed by multiplying the mean by the number of items. At least half the component items are required to be non-missing for each scale to be computed.

* Binge-eating symptoms: items 1, 5, 10.
COMPUTE u1ceatsbint = 3 * MEAN.2(u1ceats01, u1ceats05, u1ceats10).
* Preoccupation with bodily size: all other items except 11.
COMPUTE u1ceatsbodt = 8 * MEAN.4(u1ceats02, u1ceats03,
 u1ceats04, u1ceats06, u1ceats07, u1ceats08, u1ceats09, u1ceats12).
EXECUTE.

u1cedat

Standardised educational attainment composite for twins, derived from variables in the phase 1 twin SES questionnaire. The two components are u1chqualp (the probable highest level of qualification after current study, which is a derived variable described elsewhere on this page) and u1cdegr1 (the degree classification for those twins who have already graduated).
Each component, and the final composite, is standardised by cohort to eliminate significant cohort differences. Coding is such that higher values indicate higher SES. The derivation is explained by comments in the syntax.
Standardisation of u1cdegr1 before taking the mean effectively gives twins who are still taking degrees the same level as those who have completed degrees and achieved the mean classification. Note that u1cdegr1 (after standardisation) is given half-weighting in the mean; as a result, twins who have completed degrees with the lowest classifications remain at or above the levels of twins who have taken A-levels but not degrees, approximately preserving an appropriate rank ordering of qualifications.

* Derive twin SES purely based on educational level.
* All components show significant cohort effects.
* so start by standardising within each cohort.
SORT CASES  BY ucohort.
SPLIT FILE SEPARATE BY ucohort.

DESCRIPTIVES VARIABLES=u1chqualp u1cdegr1
  /SAVE.

SPLIT FILE OFF.

* Twin Education composite is a mean of 2 standardised components, unequally weighted.
* (1) probable educational level as derived above and (2) degree classification.
* which is given half weighting; this weighting is designed to retain a higher level.
* for those with low-classification degrees than for those with A-levels.
* Note also that degree classification is missing for over half the twins, namely.
* those without degrees and those still studying towards degrees.
COMPUTE twineduses = MEAN(Zu1chqualp, (Zu1cdegr1 / 2)).
EXECUTE.

* Re-standardise the new composite to correct the SD to 1.
* and to ensure cohort differences are ironed out in the mean.
SORT CASES  BY ucohort.
SPLIT FILE SEPARATE BY ucohort.

DESCRIPTIVES VARIABLES= twineduses (u1cedat)
  /SAVE.

SPLIT FILE OFF.
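
The three steps (standardise within cohort, take the mean with half weighting for degree classification, then re-standardise within cohort) can be sketched in Python as below. This is an illustration under stated assumptions, not the TEDS syntax: the function names and the example data are hypothetical, missing values are represented by None, and z-scores use the sample standard deviation (divisor n - 1).

```python
from statistics import mean, stdev


def zscores_by_group(values, groups):
    """Standardise values within each group; missing values stay missing."""
    result = [None] * len(values)
    for group in set(groups):
        index = [i for i, g in enumerate(groups)
                 if g == group and values[i] is not None]
        if len(index) < 2:
            continue  # a standard deviation needs at least two values
        group_values = [values[i] for i in index]
        centre, spread = mean(group_values), stdev(group_values)
        if spread == 0:
            continue
        for i in index:
            result[i] = (values[i] - centre) / spread
    return result


def education_composite(qualification, degree, cohort):
    """Mean of z(qualification) and half of z(degree), re-standardised by cohort."""
    z_qual = zscores_by_group(qualification, cohort)
    z_degree = zscores_by_group(degree, cohort)
    raw = []
    for q, d in zip(z_qual, z_degree):
        # Like MEAN() in the syntax, use whichever components are non-missing.
        parts = [v for v in (q, None if d is None else d / 2) if v is not None]
        raw.append(sum(parts) / len(parts) if parts else None)
    return zscores_by_group(raw, cohort)


# Made-up example: two cohorts of four twins each.
qualification = [3, 5, 7, 9, 2, 4, 6, 8]
degree = [None, 2, 3, None, 1, None, None, 3]
cohort = [1, 1, 1, 1, 2, 2, 2, 2]
composite = education_composite(qualification, degree, cohort)
print([round(value, 2) for value in composite])
```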

u1cfconm1/2

Future Consequences scale, derived from all 4 items of the measure in the phase 1 twin questionnaire. Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cfconm = MEAN.2(u1cfcon1, u1cfcon2, u1cfcon3, u1cfcon4).
EXECUTE.

u1cfinam1/2

CLAS Financial Wellbeing scale, derived from all 5 items of the measure in the phase 1 twin questionnaire (reversed where necessary). Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cfinam = MEAN.3(u1cfina1, u1cfina2, u1cfina3, u1cfina4, u1cfina5r).
EXECUTE.

u1cfprdm1/2

Financial Products familiarity scale, derived from all 13 items of the measure in the phase 1 twin questionnaire. Each item has integer response values 0-4, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cfprdm = MEAN.7(u1cfprd01, u1cfprd02, u1cfprd03,
 u1cfprd04, u1cfprd05, u1cfprd06, u1cfprd07, u1cfprd08, u1cfprd09, 
 u1cfprd10, u1cfprd11, u1cfprd12, u1cfprd13).
EXECUTE.

u1cgoalfult1/2, u1cgoalrelt1/2, ucv1goalfult1/2, ucv1goalrelt1/2, ucv2goalfult1/2, ucv2goalrelt1/2, ucv3goalfult1/2, ucv3goalrelt1/2, ucv4goalfult1/2, ucv4goalrelt1/2

Goals subscales, from the twin TEDS21 phase 1 questionnaire (u1c) and from the twin covid phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) questionnaires. The measure was the same in all questionnaires. The subscales are for relationships (goalrelt) and fulfilment (goalfult), derived from 5 and 4 items respectively. Each item has integer response values 1-5, and each subscale is a 'total' computed by multiplying the mean by the number of component items, hence it has a range of values from the number of items to (5 * number of items). At least half the items are required to be non-missing for the scale to be computed.

* Relationships subscale (5 items).
COMPUTE u1cgoalrelt = 5 * MEAN.3(u1cgoal1, u1cgoal3, u1cgoal4, u1cgoal5, u1cgoal8).
COMPUTE ucv1goalrelt = 5 * MEAN.3(ucv1goal1, ucv1goal3, ucv1goal4, ucv1goal5, ucv1goal8).
COMPUTE ucv2goalrelt = 5 * MEAN.3(ucv2goal1, ucv2goal3, ucv2goal4, ucv2goal5, ucv2goal8).
COMPUTE ucv3goalrelt = 5 * MEAN.3(ucv3goal1, ucv3goal3, ucv3goal4, ucv3goal5, ucv3goal8).
COMPUTE ucv4goalrelt = 5 * MEAN.3(ucv4goal1, ucv4goal3, ucv4goal4, ucv4goal5, ucv4goal8).
* Fulfilment subscale (4 items).
COMPUTE u1cgoalfult = 4 * MEAN.2(u1cgoal2, u1cgoal6, u1cgoal7, u1cgoal9).
COMPUTE ucv1goalfult = 4 * MEAN.2(ucv1goal2, ucv1goal6, ucv1goal7, ucv1goal9).
COMPUTE ucv2goalfult = 4 * MEAN.2(ucv2goal2, ucv2goal6, ucv2goal7, ucv2goal9).
COMPUTE ucv3goalfult = 4 * MEAN.2(ucv3goal2, ucv3goal6, ucv3goal7, ucv3goal9).
COMPUTE ucv4goalfult = 4 * MEAN.2(ucv4goal2, ucv4goal6, ucv4goal7, ucv4goal9).
EXECUTE.

u1chqualp1/2

Probable highest level of educational qualification for twins, once current studies (if any) have been completed. Based on two variables from the phase 1 twin questionnaire: (1) u1chqual, the current highest qualification level; and (2) u1ccqual, the qualification level towards which twins are currently studying (if applicable). Both items are ordinal with the same 1-11 coding, and the derived variable has the same coding. Comments in the syntax below explain the derivation.

* Enhance the 'highest educational qualification level' variable.
* to get a better measure of the 'probable' highest educational level.
* for those who are still studying.

* Default value is current highest level.
COMPUTE u1chqualp = u1chqual.
EXECUTE.
* If missing (generally because twin specified 'other' or 'overseas' qualifications).
* and if currently studying, substitute the value of u1ccqual if present.
IF (SYSMIS(u1chqual) & ~SYSMIS(u1ccqual)) u1chqualp = u1ccqual.
EXECUTE.
* If currently studying towards a higher level than is currently held.
* (if u1ccqual > u1chqual) then use the higher level.
* There are a few apparently unrealistic jumps in level here, but they are very few.
* in number and it's simplest to take both responses at face value.
IF (u1ccqual > u1chqual) u1chqualp = u1ccqual.
EXECUTE.
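
The same decision rules can be written as a short Python sketch, for illustration only (the function name is hypothetical, and missing values are represented by None).

```python
def probable_highest_qualification(highest, current):
    """Probable highest qualification level (1-11) after current studies."""
    # Default value is the current highest level.
    probable = highest
    # If the highest level is missing, substitute the level being studied for.
    if highest is None and current is not None:
        probable = current
    # If studying towards a higher level than currently held, use that level.
    if highest is not None and current is not None and current > highest:
        probable = current
    return probable


print(probable_highest_qualification(5, 8))     # studying towards a higher level
print(probable_highest_qualification(None, 7))  # highest level missing
print(probable_highest_qualification(8, 5))     # studying at a lower level
```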

u1cLLCage1/2, u1cLLCdate1/2, u1pLLCage1/2, u1pLLCdate1/2, u2cLLCage1/2, u2cLLCdate1/2, ucgLLCage1/2, ucgLLCdate1/2, ucv1LLCage1/2, ucv1LLCdate1/2, ucv2LLCage1/2, ucv2LLCdate1/2, ucv3LLCage1/2, ucv3LLCdate1/2, ucv4LLCage1/2, ucv4LLCdate1/2

Age and date variables derived for use in datasets in the LLC TRE (but not to be used in other datasets).
Ages and dates are derived for TEDS21 phase 1 parent ('u1p') and twin ('u1c') and phase 2 twin ('u2c'); the g-game tests ('ucg'), and covid questionnaire phases 1, 2, 3 and 4 ('ucv1', 'ucv2', 'ucv3', 'ucv4' respectively).
The LLC date variables contain only the month and year, not the day, as a means of reducing identifiability. The date variables are strings formatted as 'yyyy-mm'. These LLC dates are designed to enable the TEDS measures to be placed in a time sequence with NHS medical diagnosis dates in the data in the TRE.
The LLC age variables are integers measuring the number of months between birth and the given TEDS activity, consistent with the matching LLC date variables.
Variable aonsdob is the twin birth date - the raw date variables are not retained in the dataset.

* For TEDS21, we first need the best estimate of date according to return method.
NUMERIC u1pdate u1cdate u2cdate (EDATE11).
* For web/app users use the start dates.
IF (ANY(u1psource, 1, 2, 3)) u1pdate = u1pstart.
IF (ANY(u1csource, 1, 2, 3)) u1cdate = u1cstart.
IF (ANY(u2csource, 1, 2, 3)) u2cdate = u2cstart.
EXECUTE.
* For paper users use the return date.
IF (u1psource = 4) u1pdate = u1prdate.
IF (u1csource = 4) u1cdate = u1crdate1.
IF (u2csource = 4) u2cdate = u2crdate1.
EXECUTE.

* Now extract year and month as temp variables, from birth date and activity dates.
COMPUTE birthyear = XDATE.YEAR(aonsdob).
COMPUTE birthmonth = XDATE.MONTH(aonsdob).
COMPUTE u1pyear = XDATE.YEAR(u1pdate).
COMPUTE u1pmonth = XDATE.MONTH(u1pdate).
COMPUTE u1cyear = XDATE.YEAR(u1cdate).
COMPUTE u1cmonth = XDATE.MONTH(u1cdate).
COMPUTE u2cyear = XDATE.YEAR(u2cdate).
COMPUTE u2cmonth = XDATE.MONTH(u2cdate).
COMPUTE ucgyear = XDATE.YEAR(ucgconstdt).
COMPUTE ucgmonth = XDATE.MONTH(ucgconstdt).
COMPUTE ucv1year = XDATE.YEAR(ucv1constdt).
COMPUTE ucv1month = XDATE.MONTH(ucv1constdt).
COMPUTE ucv2year = XDATE.YEAR(ucv2constdt).
COMPUTE ucv2month = XDATE.MONTH(ucv2constdt).
COMPUTE ucv3year = XDATE.YEAR(ucv3constdt).
COMPUTE ucv3month = XDATE.MONTH(ucv3constdt).
COMPUTE ucv4year = XDATE.YEAR(ucv4constdt).
COMPUTE ucv4month = XDATE.MONTH(ucv4constdt).
EXECUTE.

* The agreed LLC date format is a string yyyy-mm (nominal by default for strings).
* adding a leading '0' where necessary to give two-digit months.
STRING u1pLLCdate u1cLLCdate u2cLLCdate ucgLLCdate ucv1LLCdate ucv2LLCdate ucv3LLCdate ucv4LLCdate (A7).
IF (u1pmonth < 10) u1pLLCdate = CONCAT(STRING(u1pyear, F4), '-0', STRING(u1pmonth, F1)).
IF (u1pmonth >= 10) u1pLLCdate = CONCAT(STRING(u1pyear, F4), '-', STRING(u1pmonth, F2)).
IF (u1cmonth < 10) u1cLLCdate = CONCAT(STRING(u1cyear, F4), '-0', STRING(u1cmonth, F1)).
IF (u1cmonth >= 10) u1cLLCdate = CONCAT(STRING(u1cyear, F4), '-', STRING(u1cmonth, F2)).
IF (u2cmonth < 10) u2cLLCdate = CONCAT(STRING(u2cyear, F4), '-0', STRING(u2cmonth, F1)).
IF (u2cmonth >= 10) u2cLLCdate = CONCAT(STRING(u2cyear, F4), '-', STRING(u2cmonth, F2)).
IF (ucgmonth < 10) ucgLLCdate = CONCAT(STRING(ucgyear, F4), '-0', STRING(ucgmonth, F1)).
IF (ucgmonth >= 10) ucgLLCdate = CONCAT(STRING(ucgyear, F4), '-', STRING(ucgmonth, F2)).
IF (ucv1month < 10) ucv1LLCdate = CONCAT(STRING(ucv1year, F4), '-0', STRING(ucv1month, F1)).
IF (ucv1month >= 10) ucv1LLCdate = CONCAT(STRING(ucv1year, F4), '-', STRING(ucv1month, F2)).
IF (ucv2month < 10) ucv2LLCdate = CONCAT(STRING(ucv2year, F4), '-0', STRING(ucv2month, F1)).
IF (ucv2month >= 10) ucv2LLCdate = CONCAT(STRING(ucv2year, F4), '-', STRING(ucv2month, F2)).
IF (ucv3month < 10) ucv3LLCdate = CONCAT(STRING(ucv3year, F4), '-0', STRING(ucv3month, F1)).
IF (ucv3month >= 10) ucv3LLCdate = CONCAT(STRING(ucv3year, F4), '-', STRING(ucv3month, F2)).
IF (ucv4month < 10) ucv4LLCdate = CONCAT(STRING(ucv4year, F4), '-0', STRING(ucv4month, F1)).
IF (ucv4month >= 10) ucv4LLCdate = CONCAT(STRING(ucv4year, F4), '-', STRING(ucv4month, F2)).
EXECUTE.

* The agreed LLC age variable is in integer months.
* and it must agree with the birth and booklet year/month variables that will be available in the LLC.
COMPUTE u1pLLCage = (u1pmonth + (u1pyear * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE u1cLLCage = (u1cmonth + (u1cyear * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE u2cLLCage = (u2cmonth + (u2cyear * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ucgLLCage = (ucgmonth + (ucgyear * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ucv1LLCage = (ucv1month + (ucv1year * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ucv2LLCage = (ucv2month + (ucv2year * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ucv3LLCage = (ucv3month + (ucv3year * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ucv4LLCage = (ucv4month + (ucv4year * 12)) - (birthmonth + (birthyear * 12)).
EXECUTE.
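
The LLC date and age derivations can be illustrated in Python as below (a sketch, not the TEDS syntax; the function names are hypothetical). Note that the day of the month plays no part in either variable, so the age in months is simply the difference between the two year-month values.

```python
from datetime import date


def llc_date(activity_date):
    """Format a date as a 'yyyy-mm' string, with a leading zero for months 1-9."""
    return f"{activity_date.year:04d}-{activity_date.month:02d}"


def llc_age_months(birth_date, activity_date):
    """Whole months between the birth month and the activity month."""
    return ((activity_date.month + activity_date.year * 12)
            - (birth_date.month + birth_date.year * 12))


print(llc_date(date(2018, 6, 30)))
print(llc_age_months(date(1995, 1, 31), date(2018, 6, 1)))
```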

u1cmarrhopm1/2, u1cmarrworm1/2, u1cmarrm1/2

Marriage Attitudes subscales, and an overall scale, derived from items of the measure in the phase 1 twin questionnaire.
The subscales are for hopeful attitudes (u1cmarrhopm), and worries (u1cmarrworm) while the overall scale is u1cmarrm.
Each subscale is derived from 4 items, while the overall scale is a mean of all 8 items (with the 'worries' items reversed). For each scale, at least half the component items are required to be non-missing. Each item has response values 1-5, hence each of these scales has the same range as it is computed as a mean.

* Marriage hope scale: items 1,3,6,7.
COMPUTE u1cmarrhopm = MEAN.2(u1cmarr1, u1cmarr3, u1cmarr6, u1cmarr7).
* Marriage worry scale: items 2,4,5,8.
COMPUTE u1cmarrworm = MEAN.2(u1cmarr2, u1cmarr4, u1cmarr5, u1cmarr8).
* Overall scale: all 8 items, with items 2/4/5/8 reversed.
COMPUTE u1cmarrm = MEAN.4(u1cmarr1, u1cmarr2r, u1cmarr3,
 u1cmarr4r, u1cmarr5r, u1cmarr6, u1cmarr7, u1cmarr8r).
EXECUTE.

u1cmeduphot1/2, u1cmeduvidt1/2, u1cmedusoct1/2

Media use subscales.
Derived from the 11 items of this measure in the twin phase 1 questionnaire. The subscales are for mobile phone use (u1cmeduphot), use of video (u1cmeduvidt) and for social media use (u1cmedusoct), based on 5, 2 and 4 items respectively.
Each item has integer response values 0-5, hence each subscale has a range of values from 0 to (5 * number of component items) as it is a 'total' computed by multiplying the mean by the number of items. At least half the component items are required to be non-missing for each scale to be computed, except that the syntax below requires both items to be non-missing for the 2-item video subscale.

* Phone use total: items 1-5.
COMPUTE u1cmeduphot = 5 * MEAN.3(u1cmedu01, u1cmedu02, u1cmedu03, u1cmedu04, u1cmedu05).
* Video total: items 6-7.
COMPUTE u1cmeduvidt = 2 * MEAN.2(u1cmedu06, u1cmedu07).
* Social media total: items 8-11.
COMPUTE u1cmedusoct = 4 * MEAN.2(u1cmedu08, u1cmedu09, u1cmedu10, u1cmedu11).
EXECUTE.

u1cmfqt1/2, u2cmfqt1/2, ucv1mfqt1/2, ucv2mfqt1/2, ucv3mfqt1/2, ucv4mfqt1/2

SMFQ total scale, from the TEDS21 twin phase 1 (u1c), TEDS21 twin phase 2 (u2c) and covid twin phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) questionnaires. Derived from all 8 available items of the SMFQ measure (identical sets of items were used in all these questionnaires).
Each item has integer response values 0-2, hence each scale has a range of values from 0 to (2 * number of items) because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cmfqt = 8 * MEAN.4(u1cmfq1, u1cmfq2,
 u1cmfq3, u1cmfq4, u1cmfq5, u1cmfq6, u1cmfq7, u1cmfq8).
COMPUTE u2cmfqt = 8 * MEAN.4(u2cmfq1, u2cmfq2,
 u2cmfq3, u2cmfq4, u2cmfq5, u2cmfq6, u2cmfq7, u2cmfq8).
COMPUTE ucv1mfqt = 8 * MEAN.4(ucv1mfq1, ucv1mfq2,
 ucv1mfq3, ucv1mfq4, ucv1mfq5, ucv1mfq6, ucv1mfq7, ucv1mfq8).
COMPUTE ucv2mfqt = 8 * MEAN.4(ucv2mfq1, ucv2mfq2,
 ucv2mfq3, ucv2mfq4, ucv2mfq5, ucv2mfq6, ucv2mfq7, ucv2mfq8).
COMPUTE ucv3mfqt = 8 * MEAN.4(ucv3mfq1, ucv3mfq2,
 ucv3mfq3, ucv3mfq4, ucv3mfq5, ucv3mfq6, ucv3mfq7, ucv3mfq8).
COMPUTE ucv4mfqt = 8 * MEAN.4(ucv4mfq1, ucv4mfq2,
 ucv4mfq3, ucv4mfq4, ucv4mfq5, ucv4mfq6, ucv4mfq7, ucv4mfq8).
EXECUTE.

u1cmonam1/2

Money Attitudes scale, derived from all 6 items of the measure in the phase 1 twin questionnaire (reversed where necessary). Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cmonam = MEAN.3(u1cmona1r, u1cmona2r, u1cmona3, u1cmona4, u1cmona5, u1cmona6r).
EXECUTE.

u1cmumrm1/2

See u1cdadrm1/2, etc above.

u1cobult1/2

Online bullying total scale, from the twin phase 1 questionnaire, derived from all 4 available items of the measure.
Each item has integer response values 0-2, hence the scale has a range of values from 0 to 8 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cobult = 4 * MEAN.2(u1cobul1, u1cobul2, u1cobul3, u1cobul4).
EXECUTE.

u1cparvm1/2, ucv1parvm1/2, ucv2parvm1/2, ucv3parvm1/2, ucv4parvm1/2

Partner Violence scale, derived from all 6 items of the measure in the TEDS21 phase 1 twin questionnaire (u1c) and in the covid twin phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) questionnaires. Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cparvm = MEAN.3(u1cparv1, u1cparv2, u1cparv3, u1cparv4, u1cparv5, u1cparv6).
COMPUTE ucv1parvm = MEAN.3(ucv1parv1, ucv1parv2, ucv1parv3, ucv1parv4, ucv1parv5, ucv1parv6).
COMPUTE ucv2parvm = MEAN.3(ucv2parv1, ucv2parv2, ucv2parv3, ucv2parv4, ucv2parv5, ucv2parv6).
COMPUTE ucv3parvm = MEAN.3(ucv3parv1, ucv3parv2, ucv3parv3, ucv3parv4, ucv3parv5, ucv3parv6).
COMPUTE ucv4parvm = MEAN.3(ucv4parv1, ucv4parv2, ucv4parv3, ucv4parv4, ucv4parv5, ucv4parv6).
EXECUTE.

u1cpeerprem1/2, u1cpeerrism1/2, u1cpeerm1/2

Peer Pressure subscales, and an overall scale, derived from items of the measure in the phase 1 twin questionnaire.
The subscales are for submission to peer pressure (u1cpeerprem), and engagement in risky activities (u1cpeerrism) while the overall scale is u1cpeerm.
The subscales are derived from 3 and 4 items respectively, while the overall scale is a mean of all 7 items, in each case requiring at least half of the items to be non-missing. Each item has response values 1-5, hence each of these scales has the same range as it is computed as a mean.

* Subscale: submission to peer pressure (3 items).
COMPUTE u1cpeerprem = MEAN.2(u1cpeer1, u1cpeer2, u1cpeer4).
* Subscale: risky activities (4 items).
COMPUTE u1cpeerrism = MEAN.2(u1cpeer3, u1cpeer5, u1cpeer6, u1cpeer7).
* Overall: 7 items.
COMPUTE u1cpeerm = MEAN.4(u1cpeer1, u1cpeer2, u1cpeer3, u1cpeer4, u1cpeer5, u1cpeer6, u1cpeer7).
EXECUTE.

u1cpersneum1/2, u1cpersextm1/2, u1cpersopem1/2, u1cpersagrm1/2, u1cpersconm1/2

Personality subscales, derived from items of the Big 5 Personality measure in the phase 1 twin questionnaire. Each measure is a mean of 6 of the items, requiring at least 3 of them to be non-missing. Each item has response values 1-5, hence each scale has the same range as it is computed as a mean.

* Big 5 personality.
* Self-rated: 5 subscales, each based on 6 items.
* Neuroticism.
COMPUTE u1cpersneum = MEAN.3(u1cpers01, u1cpers02, u1cpers03, u1cpers04, u1cpers05, u1cpers06).
* Extraversion.
COMPUTE u1cpersextm = MEAN.3(u1cpers07, u1cpers08, u1cpers09, u1cpers10, u1cpers11, u1cpers12).
* Openness.
COMPUTE u1cpersopem = MEAN.3(u1cpers13, u1cpers14, u1cpers15, u1cpers16, u1cpers17, u1cpers18).
* Agreeableness.
COMPUTE u1cpersagrm = MEAN.3(u1cpers19, u1cpers20, u1cpers21, u1cpers22, u1cpers23, u1cpers24).
* Conscientiousness.
COMPUTE u1cpersconm = MEAN.3(u1cpers25, u1cpers26, u1cpers27, u1cpers28, u1cpers29, u1cpers30).
EXECUTE.

u1cpilm1/2, ucv1pilm1/2, ucv2pilm1/2, ucv3pilm1/2, ucv4pilm1/2

Purpose in Life scale, derived from all 5 items of the measure in the phase 1 twin questionnaire and in the covid questionnaires (the same measure was used in each case). Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cpilm = MEAN.3(u1cpil1, u1cpil2, u1cpil3, u1cpil4, u1cpil5).
COMPUTE ucv1pilm = MEAN.3(ucv1pil1, ucv1pil2, ucv1pil3, ucv1pil4, ucv1pil5).
COMPUTE ucv2pilm = MEAN.3(ucv2pil1, ucv2pil2, ucv2pil3, ucv2pil4, ucv2pil5).
COMPUTE ucv3pilm = MEAN.3(ucv3pil1, ucv3pil2, ucv3pil3, ucv3pil4, ucv3pil5).
COMPUTE ucv4pilm = MEAN.3(ucv4pil1, ucv4pil2, ucv4pil3, ucv4pil4, ucv4pil5).
EXECUTE.

u1cprobt1/2

Problematic internet use total scale, from the twin phase 1 questionnaire, derived from all 6 available items of the measure.
Each item has integer response values 0-4, hence the scale has a range of values from 0 to 24 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cprobt = 6 * MEAN.3(u1cprob1, u1cprob2, u1cprob3, u1cprob4, u1cprob5, u1cprob6).
EXECUTE.

u1crandm1/2

RAND general health scale, derived from all 5 items of the measure in the phase 1 twin questionnaire. Items 3 and 5 are reversed for the scale. Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1crandm = MEAN.3(u1crand1, u1crand2, u1crand3r, u1crand4, u1crand5r).
EXECUTE.

u1creaphlt1/2, u1creapunt1/2, u1creapt1/2

REAP eating habits scales: two subscales and a total scale.
Derived from items of the REAP measure in the twin phase 1 questionnaire.
The subscales are for eating healthy foods (u1creaphlt) and for eating unhealthy foods (u1creapunt), based on 6 items each. The total scale (u1creapt) is derived from all 12 items of the measure.
Each item has integer response values 0-4, hence each scale has a range of values from 0 to (4 * number of component items) as it is a 'total' computed by multiplying the mean by the number of items. At least half the component items are required to be non-missing for each scale to be computed.

* Eating healthy items (1,2,3,5,8,9).
COMPUTE u1creaphlt = 6 * MEAN.3(u1creap01, u1creap02, u1creap03, u1creap05, u1creap08, u1creap09).
* Unhealthy items (4,6,7,10,11,12).
COMPUTE u1creapunt = 6 * MEAN.3(u1creap04, u1creap06, u1creap07, u1creap10, u1creap11, u1creap12).
* Total: all items, using reversed versions for unhealthy items.
COMPUTE u1creapt = 12 * MEAN.6(u1creap01, u1creap02, u1creap03,
 u1creap04r, u1creap05, u1creap06r, u1creap07r, u1creap08, u1creap09,
 u1creap10r, u1creap11r, u1creap12r).
EXECUTE.

u1crelam1/2, ucv1relam1/2, ucv2relam1/2, ucv3relam1/2, ucv4relam1/2

CLAS Love and Relationships scale, derived from items 4-6 of the measure in the TEDS21 phase 1 twin (u1c) questionnaire (items 1-3 are not suitable for scaling); and derived from the same three items that were repeated in the covid phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) twin questionnaires. Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1crelam = MEAN.2(u1crela4, u1crela5, u1crela6).
COMPUTE ucv1relam = MEAN.2(ucv1rela1, ucv1rela2, ucv1rela3).
COMPUTE ucv2relam = MEAN.2(ucv2rela1, ucv2rela2, ucv2rela3).
COMPUTE ucv3relam = MEAN.2(ucv3rela1, ucv3rela2, ucv3rela3).
COMPUTE ucv4relam = MEAN.2(ucv4rela1, ucv4rela2, ucv4rela3).
EXECUTE.

u1crelgt1/2

Religiosity total scale, from the twin phase 1 questionnaire, derived from all 5 available items of the measure.
Items 1-4 have integer response values 0-5. Item 5 has response values 1-5, and for scaling purposes these are adjusted to the 0-5 range to match the other items. Hence the total scale has a range of values from 0 to 25 because it is computed as a mean of 0-5 variables then multiplied by 5. At least half the items are required to be non-missing for the scale to be computed.

* Items 1-4 have 6 responses coded 0-5, while item 5 has 5 responses coded 1-5.
* To make scale, re-scale item 5 to values 0-5 as for the other items.
*  by subtracting 1 then multiplying by 5/4.
NUMERIC u1crelgt (F4.2).
VARIABLE LEVEL u1crelgt (SCALE).
COMPUTE u1crelgt = 5 * MEAN.3(u1crelg1, u1crelg2, u1crelg3, u1crelg4, ((u1crelg5 - 1) * 1.25)).
EXECUTE.
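
The rescaling of item 5 can be checked with a short worked example. This is an illustrative Python sketch only, assuming `None` represents a missing response; the function names are hypothetical and do not appear in the TEDS syntax.

```python
from typing import Optional, Sequence


def rescale_item5(raw: Optional[float]) -> Optional[float]:
    """Map a 1-5 response onto 0-5: subtract 1, then multiply by 5/4."""
    return None if raw is None else (raw - 1) * 1.25


def religiosity_total(items_1_to_4: Sequence[Optional[float]],
                      item5: Optional[float]) -> Optional[float]:
    """5 * mean of the five 0-5 values, requiring at least 3 non-missing."""
    values = list(items_1_to_4) + [rescale_item5(item5)]
    valid = [v for v in values if v is not None]
    if len(valid) < 3:
        return None
    return 5 * sum(valid) / len(valid)


# The end points of item 5 land on the end points of the other items.
print(rescale_item5(1), rescale_item5(3), rescale_item5(5))
print(religiosity_total([5, 5, 5, 5], 5))
```

The rescaled item takes non-integer values (for example 1.25 and 2.5), so the total is not restricted to whole numbers.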

u1crskt1/2, u1prskt1/2

Risk Taking total scales, from the same measure in the twin phase 1 (u1crsk) and parent phase 1 (u1prsk) questionnaires. In each case, the scale is derived from all 6 item variables. Each item has integer response values 0-4, hence each scale has a range of values from 0 to 24. At least half the items are required to be non-missing for each scale to be computed.

COMPUTE u1crskt = 6 * MEAN.3(u1crsk1, u1crsk2, u1crsk3, u1crsk4, u1crsk5, u1crsk6).
COMPUTE u1prskt1 = 6 * MEAN.3(u1prsk11, u1prsk21, u1prsk31, u1prsk41, u1prsk51, u1prsk61).
EXECUTE.

u1csdqemot1/2, u1csdqpert1/2, u1csdqhypt1/2, u1csdqcont1/2, u1csdqprot1/2, u1csdqbeht1/2, u1psdqemot1/2, u1psdqpert1/2, u1psdqhypt1/2, u1psdqcont1/2, u1psdqprot1/2, u1psdqbeht1/2, ucv1sdqemot1/2, ucv1sdqpert1/2, ucv1sdqhypt1/2, ucv1sdqcont1/2, ucv1sdqprot1/2, ucv1sdqbeht1/2, ucv2sdqemot1/2, ucv2sdqpert1/2, ucv2sdqhypt1/2, ucv2sdqcont1/2, ucv2sdqprot1/2, ucv2sdqbeht1/2, ucv3sdqemot1/2, ucv3sdqpert1/2, ucv3sdqhypt1/2, ucv3sdqcont1/2, ucv3sdqprot1/2, ucv3sdqbeht1/2, ucv4sdqemot1/2, ucv4sdqpert1/2, ucv4sdqhypt1/2, ucv4sdqcont1/2, ucv4sdqprot1/2, ucv4sdqbeht1/2

SDQ scales, from the TEDS21 phase 1 twin (u1c) and parent (u1p) questionnaires and from the covid twin phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) questionnaires. The items of the measure were the same in all questionnaires, except that the wording was modified for the parent-reported version.
Total behaviour problems scale plus five subscales, derived from items (reversed where necessary) of the SDQ measure.
Each item has integer response values 0-2, hence each scale has a range of values from 0 to (2 * number of items) because it is computed as the mean multiplied by the number of component items. At least half the component items are required to be non-missing for each scale to be computed.
Note that the SDQ Emotion scale was previously referred to as Anxiety.

* Emotional symptoms (previously called Anxiety).
COMPUTE u1psdqemot1 = 5 * MEAN.3(u1psdqemo11, u1psdqemo21, u1psdqemo31, u1psdqemo41, u1psdqemo51).
COMPUTE u1csdqemot = 5 * MEAN.3(u1csdqemo1, u1csdqemo2, u1csdqemo3, u1csdqemo4, u1csdqemo5).
COMPUTE ucv1sdqemot = 5 * MEAN.3(ucv1sdqemo1, ucv1sdqemo2, ucv1sdqemo3, ucv1sdqemo4, ucv1sdqemo5).
COMPUTE ucv2sdqemot = 5 * MEAN.3(ucv2sdqemo1, ucv2sdqemo2, ucv2sdqemo3, ucv2sdqemo4, ucv2sdqemo5).
COMPUTE ucv3sdqemot = 5 * MEAN.3(ucv3sdqemo1, ucv3sdqemo2, ucv3sdqemo3, ucv3sdqemo4, ucv3sdqemo5).
COMPUTE ucv4sdqemot = 5 * MEAN.3(ucv4sdqemo1, ucv4sdqemo2, ucv4sdqemo3, ucv4sdqemo4, ucv4sdqemo5).
EXECUTE.
* Peer problems.
COMPUTE u1psdqpert1 = 5 * MEAN.3(u1psdqper11, u1psdqper2r1, u1psdqper3r1, u1psdqper41, u1psdqper51).
COMPUTE u1csdqpert = 5 * MEAN.3(u1csdqper1, u1csdqper2r, u1csdqper3r, u1csdqper4, u1csdqper5).
COMPUTE ucv1sdqpert = 5 * MEAN.3(ucv1sdqper1, ucv1sdqper2r, ucv1sdqper3r, ucv1sdqper4, ucv1sdqper5).
COMPUTE ucv2sdqpert = 5 * MEAN.3(ucv2sdqper1, ucv2sdqper2r, ucv2sdqper3r, ucv2sdqper4, ucv2sdqper5).
COMPUTE ucv3sdqpert = 5 * MEAN.3(ucv3sdqper1, ucv3sdqper2r, ucv3sdqper3r, ucv3sdqper4, ucv3sdqper5).
COMPUTE ucv4sdqpert = 5 * MEAN.3(ucv4sdqper1, ucv4sdqper2r, ucv4sdqper3r, ucv4sdqper4, ucv4sdqper5).
EXECUTE.
* Hyperactivity.
COMPUTE u1psdqhypt1 = 5 * MEAN.3(u1psdqhyp11, u1psdqhyp21, u1psdqhyp31, u1psdqhyp4r1, u1psdqhyp5r1).
COMPUTE u1csdqhypt = 5 * MEAN.3(u1csdqhyp1, u1csdqhyp2, u1csdqhyp3, u1csdqhyp4r, u1csdqhyp5r).
COMPUTE ucv1sdqhypt = 5 * MEAN.3(ucv1sdqhyp1, ucv1sdqhyp2, ucv1sdqhyp3, ucv1sdqhyp4r, ucv1sdqhyp5r).
COMPUTE ucv2sdqhypt = 5 * MEAN.3(ucv2sdqhyp1, ucv2sdqhyp2, ucv2sdqhyp3, ucv2sdqhyp4r, ucv2sdqhyp5r).
COMPUTE ucv3sdqhypt = 5 * MEAN.3(ucv3sdqhyp1, ucv3sdqhyp2, ucv3sdqhyp3, ucv3sdqhyp4r, ucv3sdqhyp5r).
COMPUTE ucv4sdqhypt = 5 * MEAN.3(ucv4sdqhyp1, ucv4sdqhyp2, ucv4sdqhyp3, ucv4sdqhyp4r, ucv4sdqhyp5r).
EXECUTE.
* Conduct.
COMPUTE u1psdqcont1 = 5 * MEAN.3(u1psdqcon11, u1psdqcon2r1, u1psdqcon31, u1psdqcon41, u1psdqcon51).
COMPUTE u1csdqcont = 5 * MEAN.3(u1csdqcon1, u1csdqcon2r, u1csdqcon3, u1csdqcon4, u1csdqcon5).
COMPUTE ucv1sdqcont = 5 * MEAN.3(ucv1sdqcon1, ucv1sdqcon2r, ucv1sdqcon3, ucv1sdqcon4, ucv1sdqcon5).
COMPUTE ucv2sdqcont = 5 * MEAN.3(ucv2sdqcon1, ucv2sdqcon2r, ucv2sdqcon3, ucv2sdqcon4, ucv2sdqcon5).
COMPUTE ucv3sdqcont = 5 * MEAN.3(ucv3sdqcon1, ucv3sdqcon2r, ucv3sdqcon3, ucv3sdqcon4, ucv3sdqcon5).
COMPUTE ucv4sdqcont = 5 * MEAN.3(ucv4sdqcon1, ucv4sdqcon2r, ucv4sdqcon3, ucv4sdqcon4, ucv4sdqcon5).
EXECUTE.
* Prosocial.
COMPUTE u1psdqprot1 = 5 * MEAN.3(u1psdqpro11, u1psdqpro21, u1psdqpro31, u1psdqpro41, u1psdqpro51).
COMPUTE u1csdqprot = 5 * MEAN.3(u1csdqpro1, u1csdqpro2, u1csdqpro3, u1csdqpro4, u1csdqpro5).
COMPUTE ucv1sdqprot = 5 * MEAN.3(ucv1sdqpro1, ucv1sdqpro2, ucv1sdqpro3, ucv1sdqpro4, ucv1sdqpro5).
COMPUTE ucv2sdqprot = 5 * MEAN.3(ucv2sdqpro1, ucv2sdqpro2, ucv2sdqpro3, ucv2sdqpro4, ucv2sdqpro5).
COMPUTE ucv3sdqprot = 5 * MEAN.3(ucv3sdqpro1, ucv3sdqpro2, ucv3sdqpro3, ucv3sdqpro4, ucv3sdqpro5).
COMPUTE ucv4sdqprot = 5 * MEAN.3(ucv4sdqpro1, ucv4sdqpro2, ucv4sdqpro3, ucv4sdqpro4, ucv4sdqpro5).
EXECUTE.
* Behaviour problems total - all items except prosocial.
COMPUTE u1psdqbeht1 = 20 * MEAN.10(u1psdqhyp11, u1psdqemo11, u1psdqcon11, u1psdqper11,
 u1psdqcon2r1, u1psdqemo21, u1psdqhyp21, u1psdqper2r1, u1psdqcon31, u1psdqemo31, u1psdqper3r1, 
 u1psdqhyp31, u1psdqemo41, u1psdqcon41, u1psdqper41, u1psdqhyp4r1, u1psdqcon51, u1psdqper51, 
 u1psdqemo51, u1psdqhyp5r1).
COMPUTE u1csdqbeht = 20 * MEAN.10(u1csdqhyp1, u1csdqemo1, u1csdqcon1, u1csdqper1, 
 u1csdqcon2r, u1csdqemo2, u1csdqhyp2, u1csdqper2r, u1csdqcon3, u1csdqemo3, u1csdqper3r, 
 u1csdqhyp3, u1csdqemo4, u1csdqcon4, u1csdqper4, u1csdqhyp4r, u1csdqcon5, u1csdqper5,
 u1csdqemo5, u1csdqhyp5r).
COMPUTE ucv1sdqbeht = 20 * MEAN.10(ucv1sdqhyp1, ucv1sdqemo1, ucv1sdqcon1, ucv1sdqper1, 
 ucv1sdqcon2r, ucv1sdqemo2, ucv1sdqhyp2, ucv1sdqper2r, ucv1sdqcon3, ucv1sdqemo3, ucv1sdqper3r, 
 ucv1sdqhyp3, ucv1sdqemo4, ucv1sdqcon4, ucv1sdqper4, ucv1sdqhyp4r, ucv1sdqcon5, ucv1sdqper5,
 ucv1sdqemo5, ucv1sdqhyp5r).
COMPUTE ucv2sdqbeht = 20 * MEAN.10(ucv2sdqhyp1, ucv2sdqemo1, ucv2sdqcon1, ucv2sdqper1, 
 ucv2sdqcon2r, ucv2sdqemo2, ucv2sdqhyp2, ucv2sdqper2r, ucv2sdqcon3, ucv2sdqemo3, ucv2sdqper3r, 
 ucv2sdqhyp3, ucv2sdqemo4, ucv2sdqcon4, ucv2sdqper4, ucv2sdqhyp4r, ucv2sdqcon5, ucv2sdqper5,
 ucv2sdqemo5, ucv2sdqhyp5r).
COMPUTE ucv3sdqbeht = 20 * MEAN.10(ucv3sdqhyp1, ucv3sdqemo1, ucv3sdqcon1, ucv3sdqper1, 
 ucv3sdqcon2r, ucv3sdqemo2, ucv3sdqhyp2, ucv3sdqper2r, ucv3sdqcon3, ucv3sdqemo3, ucv3sdqper3r, 
 ucv3sdqhyp3, ucv3sdqemo4, ucv3sdqcon4, ucv3sdqper4, ucv3sdqhyp4r, ucv3sdqcon5, ucv3sdqper5,
 ucv3sdqemo5, ucv3sdqhyp5r).
COMPUTE ucv4sdqbeht = 20 * MEAN.10(ucv4sdqhyp1, ucv4sdqemo1, ucv4sdqcon1, ucv4sdqper1, 
 ucv4sdqcon2r, ucv4sdqemo2, ucv4sdqhyp2, ucv4sdqper2r, ucv4sdqcon3, ucv4sdqemo3, ucv4sdqper3r, 
 ucv4sdqhyp3, ucv4sdqemo4, ucv4sdqcon4, ucv4sdqper4, ucv4sdqhyp4r, ucv4sdqcon5, ucv4sdqper5,
 ucv4sdqemo5, ucv4sdqhyp5r).
EXECUTE.

u1cselft1/2

Self Control total scale, from the twin phase 1 questionnaire, derived from all 6 available items of the Self Control measure (reversed where necessary).
Each item has integer response values 0-4, hence the scale has a range of values from 0 to 24 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cselft = 6 * MEAN.3(u1cself1r, u1cself2, u1cself3, u1cself4, u1cself5, u1cself6).
EXECUTE.

u1csexbriskt1/2

Risky sexual behaviour total scale, from the twin phase 1 questionnaire, derived from 5 of the 7 items of the measure.
Details of the derivation are explained in the syntax below. Briefly, each of items 2, 3, 4 and 6 is recoded to a 0-4 range, reversing where necessary such that 0=no risk and 4=highest risk. The scale is computed as the mean multiplied by the number of component items, hence it has a range 0-16. At least half the items are required to be non-missing for the scale to be computed. For twins who responded 0=no in screening item 1, a zero score is given.

* Risky sexual behaviour.
* Based on items 1, 2, 3, 4 and 6 of the sexual behaviour measure.
* Omit item 5 (other contraceptives) because it correlates negatively with item 4 and not at all with the others.
* Omit item 7 (HIV) as it has an odd negative correlation with item 1 and does not correlate with the others.
* Recode each of items 2, 3, 4 and 6 to a scale of 0-4, with 0=no risk and 4=highest risk.
* and in each case recoded from missing to 0 (no risk) if the item 1 response was 'no' (never had sex).

* Item 2 (raw coding 1-7): lower age=higher risk, so reverse the coding and convert to 0-4 scale.
* Highest age range response is 17 or older, corresponding to lowest risk level (recoded to 0).
COMPUTE u1csexb2R = (7 - u1csexb2) * 4 / 6.
EXECUTE.
* Item 3, number of sexual partners (raw coding 1-5) is coded in the right direction.
* rescale to 1-4 (1=1 person up to 4=15+ people).
* reserving 0 for the no-risk cases of sexual intercourse with 0 people.
RECODE u1csexb3
 (1=1) (2=1.75) (3=2.5) (4=3.25) (5=4)
INTO u1csexb3R.
EXECUTE.
* Item 4 (use of condoms) has coding 0-4 but needs reversing.
* (assume 'always' response is equivalent to zero risk).
RECODE u1csexb4
 (0=4) (1=3) (2=2) (3=1) (4=0)
INTO u1csexb4R.
EXECUTE.
* Item 6 needs no recoding: 0-4 with 0=no risk, but create a new variable to allow recoding from item 1.
COMPUTE u1csexb6R = u1csexb6.
EXECUTE.
* The 4 variables above are all missing if item 1 has response 0 (never had sex).
* so in this case recode them all to 0=no risk.
DO IF (u1csexb1 = 0).
 RECODE u1csexb2R u1csexb3R u1csexb4R u1csexb6R (SYSMIS=0).
END IF.
EXECUTE.
* Now create total scale from these four recoded variables.
COMPUTE u1csexbriskt = 4 * MEAN.2(u1csexb2R, u1csexb3R, u1csexb4R, u1csexb6R).
EXECUTE.
* drop temporary variables u1csexb2R u1csexb3R u1csexb4R u1csexb6R at the end of this script.
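
The recoding steps above can be followed through for a single respondent. This is a hedged Python sketch of the same logic, not the TEDS implementation: the function name `risk_total` is hypothetical, `None` is assumed to stand for a missing value, and raw responses are assumed to be integers within the documented coding.

```python
from typing import Optional

ITEM3_RECODE = {1: 1.0, 2: 1.75, 3: 2.5, 4: 3.25, 5: 4.0}  # number of partners
ITEM4_RECODE = {0: 4, 1: 3, 2: 2, 3: 1, 4: 0}               # condom use, reversed


def risk_total(item1, item2, item3, item4, item6) -> Optional[float]:
    """Total of the four recoded 0-4 items, as 4 * mean with at least 2 valid."""
    recoded = [
        None if item2 is None else (7 - item2) * 4 / 6,  # age: 1-7 reversed onto 0-4
        ITEM3_RECODE.get(item3),
        ITEM4_RECODE.get(item4),
        item6,                                            # already 0-4, 0 = no risk
    ]
    if item1 == 0:
        # Screening item answered 'no': missing follow-up items count as no risk.
        recoded = [0 if v is None else v for v in recoded]
    valid = [v for v in recoded if v is not None]
    if len(valid) < 2:
        return None
    return 4 * sum(valid) / len(valid)


print(risk_total(0, None, None, None, None))  # screened out
print(risk_total(1, 1, 5, 0, 4))              # highest risk on every item
```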

u1csexorn1/2

Recoded, ordinal sexual orientation item, with integer values from 1 (always attracted to the opposite sex) up to 5 (always attracted to the same sex). This version of the item applies to both male and female twins. The raw item (u1csexor) was coded according to specific sexes, e.g. 1=always attracted to males. See comments in the syntax below for further details.

* Sexual orientation.
* Responses to this item were in terms of sex (male or female).
* Recode ordinally for twins of either sex into same-sex, opposite-sex, etc.
* always opposite sex.
IF (sex1 = 0 & u1csexor = 1) u1csexorn = 1.
IF (sex1 = 1 & u1csexor = 5) u1csexorn = 1.
* mostly opposite sex.
IF (sex1 = 0 & u1csexor = 2) u1csexorn = 2.
IF (sex1 = 1 & u1csexor = 4) u1csexorn = 2.
* equally same and opposite sex.
IF (u1csexor = 3) u1csexorn = 3.
* mostly same sex.
IF (sex1 = 0 & u1csexor = 4) u1csexorn = 4.
IF (sex1 = 1 & u1csexor = 2) u1csexorn = 4.
* always same sex.
IF (sex1 = 0 & u1csexor = 5) u1csexorn = 5.
IF (sex1 = 1 & u1csexor = 1) u1csexorn = 5.
EXECUTE.
* leave missing for raw responses of 6=little or no attraction.
* and 7=unsure, to preserve ordinal character of this derived variable.
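
The nine IF statements above amount to keeping the raw code for one sex and mirroring it (6 minus the raw code) for the other. The following Python sketch illustrates that equivalence; it is not the TEDS syntax, the function name is hypothetical, and `None` is assumed to represent a missing value.

```python
from typing import Optional


def recode_orientation(sex: Optional[int], raw: Optional[int]) -> Optional[int]:
    """Return 1 (always opposite sex) to 5 (always same sex), or None."""
    if raw == 3:
        return 3  # 'equally' needs no knowledge of the twin's sex
    if sex not in (0, 1) or raw not in (1, 2, 4, 5):
        return None  # raw codes 6 and 7, or missing data, stay missing
    # For sex code 0 the raw code is already in the right direction;
    # for sex code 1 it is mirrored.
    return raw if sex == 0 else 6 - raw


for sex in (0, 1):
    print(sex, [recode_orientation(sex, raw) for raw in range(1, 8)])
```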

u1cstexlikt1/2, u1cstexdevt1/2

Student Experiences subscales.
Derived from items 8 to 19 of the measure in the twin phase 1 questionnaire (items 1-7 are not suitable for scaling). The subscales are for liking of university (u1cstexlikt) and for personal and intellectual development (u1cstexdevt), based on 4 and 8 items respectively.
Each item has integer response values 0-4, hence each subscale has a range of values from 0 to (4 * number of component items) as it is a 'total' computed by multiplying the mean by the number of items. At least half the component items are required to be non-missing for each scale to be computed.

* Liking university subscale: items 8-11.
COMPUTE u1cstexlikt = 4 * MEAN.2(u1cstex08, u1cstex09, u1cstex10, u1cstex11).
* Personal and intellectual development: items 12-19.
COMPUTE u1cstexdevt = 8 * MEAN.4(u1cstex12, u1cstex13, u1cstex14,
 u1cstex15, u1cstex16, u1cstex17, u1cstex18, u1cstex19).
EXECUTE.

u1ctwnrm1/2

See u1cdadrm1/2, etc above.

u1cvolnt1/2, ucv1volnt1/2, ucv2volnt1/2, ucv3volnt1/2

Volunteering total scale, from the TEDS21 twin phase 1 (u1c) questionnaire and from the covid twin phase 1 (ucv1), phase 2 (ucv2) and phase 3 (ucv3) questionnaires. Derived from all available items of the measure (5 items in TEDS21, and a different 3 items in covid).
Each item has integer response values 0-4, hence each scale has a range of values from 0 to (4 * number of items) because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cvolnt = 5 * MEAN.3(u1cvoln1, u1cvoln2, u1cvoln3, u1cvoln4, u1cvoln5).
COMPUTE ucv1volnt = 3 * MEAN.2(ucv1voln1, ucv1voln2, ucv1voln3).
COMPUTE ucv2volnt = 3 * MEAN.2(ucv2voln1, ucv2voln2, ucv2voln3).
COMPUTE ucv3volnt = 3 * MEAN.2(ucv3voln1, ucv3voln2, ucv3voln3).
EXECUTE.

u1page

See u1cage1/2, etc above.

u1pconimpt1/2, u1pconinat1/2, u1pcont1/2

Conners ADHD scales.
Derived from items of the Conners measure in the parent phase 1 questionnaire.
The measure has a total scale (18 items) plus two subscales (9 items each).
Each item has integer response values 0-3, hence each scale has a range of values from 0 to (3 * number of component items) as it is a 'total' computed by multiplying the mean by the number of items. At least half the component items are required to be non-missing for each scale to be computed.

* Impulsivity (9 items).
COMPUTE u1pconimpt1 = 9 * MEAN.5(u1pcon101, u1pcon111, u1pcon121,
 u1pcon131, u1pcon141, u1pcon151, u1pcon161, u1pcon171, u1pcon181).
* Inattention (9 items).
COMPUTE u1pconinat1 = 9 * MEAN.5(u1pcon011, u1pcon021, u1pcon031,
 u1pcon041, u1pcon051, u1pcon061, u1pcon071, u1pcon081, u1pcon091).
* Total (all 18 items).
COMPUTE u1pcont1 = 18 * MEAN.9(u1pcon011, u1pcon021, u1pcon031,
 u1pcon041, u1pcon051, u1pcon061, u1pcon071, u1pcon081, u1pcon091,
 u1pcon101, u1pcon111, u1pcon121, u1pcon131, u1pcon141, u1pcon151,
 u1pcon161, u1pcon171, u1pcon181).
EXECUTE.

u1pdevmob, u1pdevwdth

Device categories used for the TEDS21 parent phase 1 questionnaire, if completed electronically.
u1pdevmob flags mobile devices (mobile phones and tablets), coded 1=yes 0=no.
u1pdevwdth is a device width category, coded 1=small, 2=medium, 3=large, where width refers to the widest side of the device's screen.
Derived from raw item variables, collected on the web/app server, that are not retained in the dataset. Mobile devices could be categorised from both app and web data, using different methods. Screen sizes were only recorded in the web data, not the app data.

* Mobile devices: CMS data.
* ------------------------.
* The raw CMS variable PlatformType has values 'Web' or 'Mobile'.
* so we can assume 'Mobile' value refers to mobile devices.
* while 'Web' in most cases probably means a web browser used on a laptop/desktop.
RECODE PlatformType ('Web'=0) ('Mobile'=1)
INTO u1pdevmob.

* Mobile devices: web backup.
* --------------------------.
* Use substrings of the consenttechuseragent string to categorise broad device types.
* 'Windows', 'Macintosh' and 'X11' are probably all laptops or desktops.
IF (CHAR.INDEX(u1pconsenttechuseragent, 'Windows') > 0) devicetype = 1.
IF (CHAR.INDEX(u1pconsenttechuseragent, 'Macintosh') > 0) devicetype = 2.
IF (CHAR.INDEX(u1pconsenttechuseragent, 'X11') > 0) devicetype = 3.
EXECUTE.
* 'iPhone' and 'Android' are likely to be mobile phones.
* (in rare cases of Windows phones with Android installed, this supersedes 'Windows' above).
IF (CHAR.INDEX(u1pconsenttechuseragent, 'iPhone') > 0) devicetype = 4.
IF (CHAR.INDEX(u1pconsenttechuseragent, 'Android') > 0) devicetype = 5.
EXECUTE.
* 'iPad' and 'Tablet' are presumably tablets (the latter can overrule 'Windows').
IF (CHAR.INDEX(u1pconsenttechuseragent, 'iPad') > 0) devicetype = 6.
IF (CHAR.INDEX(u1pconsenttechuseragent, 'Tablet') > 0) devicetype = 7.
EXECUTE.
* 'Mobile' is another common substring, but it only appears for mobile phones.
* that are already categorised by the other substrings above.

* Recode into a binary variable to flag mobile devices.
* to be retained in the dataset (raw device variables will be dropped).
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO u1pdevmob.
EXECUTE.

* Screen width: web backup only.
* ----------------------------.
* Some devices may be used in portrait or landscape mode.
* so for consistency treat the largest dimension as the screen 'width'.
IF (u1pconsenttechscrwidth >= u1pconsenttechscrheight) screenwidth = u1pconsenttechscrwidth.
IF (u1pconsenttechscrwidth < u1pconsenttechscrheight) screenwidth = u1pconsenttechscrheight.
EXECUTE.
* Categorise screen widths arbitrarily as small, medium or large.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO u1pdevwdth.
EXECUTE.
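
The web-based categorisation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the TEDS code: the function names are hypothetical, the user agent strings are invented examples, screen dimensions are assumed to be whole numbers of pixels, and `None` stands for a missing value. As in the syntax, a later substring match overrides an earlier one.

```python
from typing import Optional

# Order matters: later matches override earlier ones, as in the SPSS IF sequence.
TOKENS = ["Windows", "Macintosh", "X11", "iPhone", "Android", "iPad", "Tablet"]


def is_mobile(user_agent: str) -> Optional[int]:
    """1 for phones/tablets, 0 for laptops/desktops, None if nothing matches."""
    device = None
    for code, token in enumerate(TOKENS, start=1):
        if token in user_agent:  # case-sensitive substring test
            device = code
    if device is None:
        return None
    return 0 if device <= 3 else 1


def width_category(width: Optional[int], height: Optional[int]) -> Optional[int]:
    """1=small, 2=medium, 3=large, based on the longer screen dimension."""
    if width is None or height is None:
        return None
    longest = max(width, height)
    if longest <= 767:
        return 1
    if longest <= 1199:
        return 2
    return 3


print(is_mobile("Mozilla/5.0 (Linux; Android 10)"))  # invented example string
print(width_category(375, 812))                      # portrait phone, longer side 812
```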

u1pLLCage1/2, u1pLLCdate1/2

See u1cLLCage1/2, etc above.

u1pparm1/2, u1pparnegm1/2, u1pparposm1/2

Parental feelings scales, derived from the 6 items of the measure in the phase 1 parent questionnaire.
The subscales are for negative feelings (u1pparnegm), and positive feelings (u1pparposm). The overall scale (u1pparm) is derived from both negative and positive feelings with coding in the same direction as negative feelings.
Each subscale is derived from 3 items, of which at least 2 are required to be non-missing. The overall scale is derived from all 6 items, using the negative items as they are and the positive items after reversing, and at least 4 of the 6 items are required to be non-missing.
Each item has response values 1-5, hence each of these scales has the same range as it is computed as a mean.

* Parental feelings.
* TEDS21 Phase 1, parent-rated, 6 items.
* Two subscales, for positive and negative feelings, 3 items each.
* and an overall scale from all 6 items.
* Negative feelings: items 1, 4, 6.
COMPUTE u1pparnegm1 = MEAN.2(u1ppar11, u1ppar41, u1ppar61).
* Positive feelings: items 2, 3, 5.
COMPUTE u1pparposm1 = MEAN.2(u1ppar21, u1ppar31, u1ppar51).
EXECUTE.
* Overall scale, coded in the negative feelings direction.
* derived from all 6 items, reversing each positive item.
* by subtracting it from 6 (because coded 1/2/3/4/5).
COMPUTE u1pparm1 = MEAN.4(u1ppar11, (6 - u1ppar21), (6 - u1ppar31), 
    u1ppar41, (6 - u1ppar51), u1ppar61).
EXECUTE.
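
The reversal used for the overall scale (subtracting a 1-5 item from 6) can be illustrated with a short Python sketch. This is not the TEDS syntax: the function names and example responses are hypothetical, and `None` is assumed to represent a missing item.

```python
from typing import List, Optional, Tuple


def mean_min(values: List[Optional[float]], min_valid: int) -> Optional[float]:
    """Mean of the non-missing values if at least min_valid are present."""
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if len(valid) >= min_valid else None


def reverse(value: Optional[float]) -> Optional[float]:
    """Reverse a 1-5 item so that 1 becomes 5 and 5 becomes 1."""
    return None if value is None else 6 - value


def parental_scales(items: List[Optional[float]]) -> Tuple:
    """items[0] is item 1, ..., items[5] is item 6. Returns (neg, pos, overall)."""
    i1, i2, i3, i4, i5, i6 = items
    negative = mean_min([i1, i4, i6], 2)
    positive = mean_min([i2, i3, i5], 2)
    overall = mean_min([i1, reverse(i2), reverse(i3), i4, reverse(i5), i6], 4)
    return negative, positive, overall


print(parental_scales([5, 1, 1, 5, 1, 5]))  # most negative possible pattern
```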

u1prskt1/2

See u1crskt1/2, u1prskt1/2 above.

u1psant1/2

SANS total scale, from all 10 items in the phase 1 parent questionnaire. Each item has integer response values 0-4, hence the scale has a range of values from 0 to 40, as it is a 'total' computed by multiplying the mean by the number of items. At least half the component items are required to be non-missing for the scale to be computed.

COMPUTE u1psant1 = 10 * MEAN.5(u1psan011, u1psan021, u1psan031,
 u1psan041, u1psan051, u1psan061, u1psan071, u1psan081, u1psan091, u1psan101).
EXECUTE.

u1psdqemot1/2, u1psdqpert1/2, u1psdqhypt1/2, u1psdqcont1/2, u1psdqprot1/2, u1psdqbeht1/2

See u1csdqemot1/2, etc above.

u1pses

SES composite for parents, derived from 5 ordinal items in the phase 1 parent SES questionnaire. Each component, and the final composite, is standardised by cohort to eliminate significant cohort differences. Coding is such that higher values indicate higher SES. The derivation is explained by comments in the syntax.

* Derive parent SES composite from 5 components (household income, mother/father SOC and education).
* All components show significant cohort effects.
* so start by standardising them within each cohort.
SORT CASES  BY ucohort.
SPLIT FILE SEPARATE BY ucohort.

DESCRIPTIVES VARIABLES= u1pmosoc u1pfasoc u1pmohqual u1pfahqual u1pses5in
  /SAVE.
  
SPLIT FILE OFF.

* Reverse the parent SOC scores so high values = high SES.
COMPUTE Zu1pmosocR = -1 * Zu1pmosoc.
COMPUTE Zu1pfasocR = -1 * Zu1pfasoc.
EXECUTE.

* Parent SES composite is an equally-weighted mean of the 5 standardised components.
* with a requirement that at least 2 are non-missing.
COMPUTE parentses = MEAN.2(Zu1pmohqual, Zu1pfahqual, Zu1pses5in, Zu1pmosocR, Zu1pfasocR).
EXECUTE.

* Re-standardise the new composites to correct the SD to 1.
* and to correct any cohort differences that may have reappeared in the mean.
SORT CASES  BY ucohort.
SPLIT FILE SEPARATE BY ucohort.

DESCRIPTIVES VARIABLES= parentses (u1pses) 
  /SAVE.

SPLIT FILE OFF.
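
The standardise, combine, re-standardise sequence can be sketched in Python. This is an illustration only: it uses two made-up components instead of the five real ones, the function names and data values are hypothetical, `None` stands for a missing value, and the sample standard deviation (n - 1 denominator) is assumed for the z-scores.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import List, Optional, Sequence


def zscore_within(values: Sequence[Optional[float]], groups: Sequence) -> List[Optional[float]]:
    """Standardise values separately within each group (here, each cohort)."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        if v is not None:
            by_group[g].append(v)
    stats = {}
    for g, vs in by_group.items():
        if len(vs) > 1 and stdev(vs) > 0:
            stats[g] = (mean(vs), stdev(vs))
    return [
        None if v is None or g not in stats else (v - stats[g][0]) / stats[g][1]
        for v, g in zip(values, groups)
    ]


def row_mean(row: Sequence[Optional[float]], min_valid: int = 2) -> Optional[float]:
    """Equally weighted mean of the components, needing min_valid non-missing."""
    valid = [v for v in row if v is not None]
    return sum(valid) / len(valid) if len(valid) >= min_valid else None


cohort = [1, 1, 1, 2, 2, 2]
z_income = zscore_within([1, 2, 3, 10, 20, 30], cohort)
# SOC-style component where low raw values mean high SES, so the sign is flipped.
z_soc_reversed = [None if z is None else -z for z in zscore_within([3, 2, 1, 9, 5, 1], cohort)]
composite = [row_mean(pair) for pair in zip(z_income, z_soc_reversed)]
ses = zscore_within(composite, cohort)  # re-standardise within cohort
print(ses)
```

Averaging standardised components generally gives a composite with a standard deviation below 1, which is why the final within-cohort standardisation step is needed.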

u2cage1/2

See u1cage1/2, etc above.

u2calco031/2, u2calco051/2, u2calcoaudit1/2, ucv1alco21/2, ucv2alco21/2, ucv3alco21/2, ucv4alco21/2

AUDIT scale for alcohol use, with associated recoded items measuring alcohol units.
u2calcoaudit is the alcohol use AUDIT total scale, derived from items 4-13 of the TEDS21 phase 2 twin questionnaire measure. This scale is designed to match the published AUDIT scale as closely as possible.
u2calco03, u2calco05, ucv1alco2, ucv2alco2, ucv3alco2 and ucv4alco2 are estimates of the total number of alcohol units consumed, derived from respective 4-part questions in the TEDS21 phase 2 questionnaire (u2c, questions 3 and 5) and in the Covid twin questionnaires (question 2, ucv1 is phase 1, ucv2 is phase 2, ucv3 is phase 3 and ucv4 is phase 4). Note that only one of these units variables (u2calco05) is subsequently used in the derivation of the AUDIT scale.
For the AUDIT scale, the four parts of item 5 are first combined and recoded to values 0-4, as shown in the syntax below, to match the coding of the other nine items. The scale is then derived as the mean multiplied by the number of items (10), resulting in a range of values from 0 to 40. The questionnaire includes a screening item (item 1); if the response was 0=no, the scale is assigned a value of zero (otherwise it would be missing).
In deriving the mean, at least half the items are required to be non-missing for the scale to be computed.

* First convert the four parts of TEDS21 item 5 (and Covid item 2) into approximate numbers of units.
* by recoding response codes to the mid-range point of the number of drinks.
* for example 1-2 is 1.5, 2-4 is 3 and top of range 26+ is 30.
* Each glass of wine or pint of beer/cider is assumed to be 2 units, so multiply these by 2.
* Do the same for TEDS21 item 3 although the range is extended at the top end in this latter case.
RECODE u2calco03a u2calco03b u2calco05a u2calco05b
 ucv1alco2a ucv1alco2b ucv2alco2a ucv2alco2b ucv3alco2a ucv3alco2b ucv4alco2a ucv4alco2b
 (0=0) (1=3) (2=8) (3=16) (4=26) (5=36) (6=46) (7=60)
INTO u2calco3wineunits u2calco3beerunits u2calco5wineunits u2calco5beerunits
 ucv1wineunits ucv1beerunits ucv2wineunits ucv2beerunits 
 ucv3wineunits ucv3beerunits ucv4wineunits ucv4beerunits.
* Measures of alcopops and spirits are assumed to be 1 unit.
RECODE u2calco03c u2calco03d u2calco05c u2calco05d
 ucv1alco2c ucv1alco2d ucv2alco2c ucv2alco2d ucv3alco2c ucv3alco2d ucv4alco2c ucv4alco2d
 (0=0) (1=1.5) (2=4) (3=8) (4=13) (5=18) (6=23) (7=30)
INTO u2calco3alcopopunits u2calco3spiritunits u2calco5alcopopunits u2calco5spiritunits
 ucv1alcopopunits ucv1spiritunits ucv2alcopopunits ucv2spiritunits
 ucv3alcopopunits ucv3spiritunits ucv4alcopopunits ucv4spiritunits.
EXECUTE.
* Sum to get a total number of units, in the replacement variables.
* These will be retained in place of the raw items.
* round to an integer because these are approximate anyway.
COMPUTE u2calco03 = RND(SUM(u2calco3wineunits, u2calco3beerunits, u2calco3alcopopunits, u2calco3spiritunits)).
COMPUTE u2calco05 = RND(SUM(u2calco5wineunits, u2calco5beerunits, u2calco5alcopopunits, u2calco5spiritunits)).
COMPUTE ucv1alco2 = RND(SUM(ucv1wineunits, ucv1beerunits, ucv1alcopopunits, ucv1spiritunits)). 
COMPUTE ucv2alco2 = RND(SUM(ucv2wineunits, ucv2beerunits, ucv2alcopopunits, ucv2spiritunits)). 
COMPUTE ucv3alco2 = RND(SUM(ucv3wineunits, ucv3beerunits, ucv3alcopopunits, ucv3spiritunits)). 
COMPUTE ucv4alco2 = RND(SUM(ucv4wineunits, ucv4beerunits, ucv4alcopopunits, ucv4spiritunits)). 
EXECUTE.
* Recode number of units in Q5 to categories (0-4 scale).
* using the ranges in the published AUDIT scale.
RECODE u2calco05 
 (0 THRU 2=0) (2.1 THRU 4=1) (4.1 THRU 6=2) (6.1 THRU 9.9=3) (10 THRU HIGHEST=4)
INTO u2calco05un.
EXECUTE.
* Now create a total AUDIT score from items 4-13, all coded 0-4, including recoded item 5.
COMPUTE u2calcoaudit = 10 * MEAN.5(u2calco04, u2calco05un, u2calco06,
 u2calco07, u2calco08, u2calco09, u2calco10, u2calco11, u2calco12, u2calco13).
EXECUTE.
* Item 1 (ever had a drink) is a screening question: if 'no', other items are missing.
* The published AUDIT scale does not have a screening question.
* so assume scale value should be 0 if item 1 response is no.
IF (u2calco01 = 0) u2calcoaudit = 0.
EXECUTE.
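
The units conversion and AUDIT scoring can be traced for one respondent with the Python sketch below. It is an illustration under assumptions, not the TEDS code: the function names are hypothetical, `None` represents a missing value, and rounding is written as "half rounds up" on the assumption that this matches SPSS `RND` for non-negative values (Python's built-in `round` behaves differently for values ending in .5).

```python
import math
from typing import Optional, Sequence

WINE_BEER_UNITS = {0: 0, 1: 3, 2: 8, 3: 16, 4: 26, 5: 36, 6: 46, 7: 60}      # 2 units per drink
ALCOPOP_SPIRIT_UNITS = {0: 0, 1: 1.5, 2: 4, 3: 8, 4: 13, 5: 18, 6: 23, 7: 30}  # 1 unit per drink


def total_units(wine, beer, alcopops, spirits) -> Optional[int]:
    """Sum the available parts (like SUM) and round to an integer."""
    parts = [WINE_BEER_UNITS.get(wine), WINE_BEER_UNITS.get(beer),
             ALCOPOP_SPIRIT_UNITS.get(alcopops), ALCOPOP_SPIRIT_UNITS.get(spirits)]
    valid = [p for p in parts if p is not None]
    if not valid:
        return None
    return int(math.floor(sum(valid) + 0.5))


def units_category(units: Optional[int]) -> Optional[int]:
    """Recode integer units to the 0-4 categories used for item 5."""
    if units is None:
        return None
    for upper, category in ((2, 0), (4, 1), (6, 2), (9, 3)):
        if units <= upper:
            return category
    return 4


def audit_total(item1: Optional[int], items_4_to_13: Sequence[Optional[float]]) -> Optional[float]:
    """10 * mean of the ten 0-4 items; zero if the screening item was 'no'."""
    if item1 == 0:
        return 0
    valid = [v for v in items_4_to_13 if v is not None]
    return 10 * sum(valid) / len(valid) if len(valid) >= 5 else None


units = total_units(1, 0, 1, 0)  # 3 + 0 + 1.5 + 0 = 4.5, rounded to 5
print(units, units_category(units))
```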

u2cambit1/2

Ambition total scale, from the twin phase 2 questionnaire, derived from all 5 available items of the measure.
Each item has integer response values 0-4, hence the scale has a range of values from 0 to 20 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u2cambit = 5 * MEAN.3(u2cambi1, u2cambi2, u2cambi3, (4 - u2cambi4), u2cambi5).
EXECUTE.

u2cantbnonvt1/2, u2cantbt1/2, u2cantbviolt1/2

Antisocial behaviour scales, from the twin phase 2 questionnaire, derived from 12 of the 14 available items in the measure.
u2cantbnonvt1/2: count of non-violent, criminal behaviours.
u2cantbviolt1/2: count of violent, criminal behaviours.
u2cantbt1/2: overall total count.
Each is a count of the distinct behaviours reported rather than a conventional scale. The derivation is explained in comments in the syntax below.

* Antisocial behaviour.
* TEDS21 Phase 2, self-rated.
* The items do not correlate strongly so a conventional scale is not used.
* Instead, we derive counts of reported antisocial behaviours.
* in the two categories of violent and non-violent behaviours.
* using an approach consistent with that used in published literature.
* Each item is coded 0=no, 1=once, 2=more than once.
* For the scales, simply count positive responses (anything greater than 0).
* to give a measure of the number of antisocial, criminal behaviours reported.
* First make a temporary variable counting the number of responses of any type.
* in all the compulsory, non-branched items (not 8 or 13) of the measure.
COUNT u2cantbcount = u2cantb01 u2cantb02 u2cantb03 u2cantb04 u2cantb05 u2cantb06 
 u2cantb07 u2cantb09 u2cantb10 u2cantb11 u2cantb12 u2cantb14 u2cantb15 u2cantb16
 (0 THRU HIGHEST).
EXECUTE.
* Require over half (more than 7) of these items to be present.
DO IF (u2cantbcount > 7).
* Violent behaviours: 4 items.
 COUNT u2cantbviolt = u2cantb08 u2cantb09 u2cantb12 u2cantb13 (1 THRU HIGHEST).
* Non-violent behaviours: 10 items.
 COUNT u2cantbnonvt = u2cantb02 u2cantb03 u2cantb04 u2cantb05 u2cantb06 
  u2cantb07 u2cantb10 u2cantb14 u2cantb15 u2cantb16 (1 THRU HIGHEST).
END IF.
EXECUTE.
* Total: sum if both counts are non-missing.
COMPUTE u2cantbt = SUM.2(u2cantbviolt, u2cantbnonvt).
EXECUTE.
* Note that 2 items are not used: items 1 and 11.
* (a) because they are not used in the published scales.
* (b) because item 1 (rowdy/rude) is not a criminal behaviour, unlike the others.
* (c) because item 11 (harming animals) seems entirely uncorrelated with the others.
* Note that there are some items having very rare or even negligible responses.
* that are used above for scaling but are dropped as items from the dataset.
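
The counting approach can be illustrated with a small Python sketch. This is not the TEDS syntax: the function name is hypothetical, responses are assumed to be held in a dictionary keyed by item number, and an absent key or `None` stands for a missing response.

```python
from typing import Dict, Optional, Tuple

COMPULSORY = [1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 14, 15, 16]  # not 8 or 13
VIOLENT = [8, 9, 12, 13]
NON_VIOLENT = [2, 3, 4, 5, 6, 7, 10, 14, 15, 16]


def antisocial_counts(answers: Dict[int, Optional[int]]) -> Tuple:
    """Return (violent, non_violent, total) counts of behaviours reported."""
    answered = sum(1 for i in COMPULSORY if answers.get(i) is not None)
    if answered <= 7:
        return None, None, None  # fewer than 8 compulsory items answered

    def count_positive(item_numbers):
        # A response of 1 (once) or 2 (more than once) counts as one behaviour.
        return sum(1 for i in item_numbers if (answers.get(i) or 0) >= 1)

    violent = count_positive(VIOLENT)
    non_violent = count_positive(NON_VIOLENT)
    return violent, non_violent, violent + non_violent


# Hypothetical respondent: all compulsory items answered 'no' except a few.
example = {i: 0 for i in COMPULSORY}
example.update({1: 2, 2: 1, 8: 1, 9: 2, 11: 2})
print(antisocial_counts(example))
```

In the example, the positive responses to items 1 and 11 do not contribute to any count, because those items are excluded from both categories.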

u2ccexpt1/2

Childhood Experiences total scale, from the twin phase 2 questionnaire, derived from all 8 available items of the measure.
Each item has integer response values 0-4, hence the scale has a range of values from 0 to 32 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u2ccexpt = 8 * MEAN.4(u2ccexp1, u2ccexp2, u2ccexp3,
 u2ccexp4, u2ccexp5, u2ccexp6, u2ccexp7, u2ccexp8).
EXECUTE.

u2ccgent1/2

Cognitive enhancers total scale, from the twin phase 2 questionnaire, derived from 3 of the 4 available items of the measure. Item 3 is omitted as it is branching and it has a different response pattern from the other items.
Each included item (1, 2 and 4) has integer response values 0-4, hence the scale has a range of values from 0 to 12 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u2ccgent = 3 * MEAN.2(u2ccgen1, u2ccgen2, u2ccgen4).
EXECUTE.

u2cconnt1/2, u2cconninat1/2, u2cconnhypt1/2

Conners, twin self-report, total scale and two subscales.
u2cconnt1/2 is the total scale from all 20 items.
u2cconninat1/2 is the inattention subscale from 11 items.
u2cconnhypt1/2 is the hyperactivity subscale from 9 items.
Each item has values 0/1/2/3, hence the scale values have ranges 0-60, 0-33 and 0-27 respectively. At least half the items are required to be non-missing for each scale to be computed.

* Conners (twin).
* TEDS21 Phase 2, self-rated, 20 items.
* Total scale and two subscales.
* Note this is a different version from the parent Conners.
* Inattention scale from the first 11 items.
COMPUTE u2cconninat = 11 * MEAN.6(u2cconn01, u2cconn02, u2cconn03, u2cconn04, 
 u2cconn05, u2cconn06, u2cconn07, u2cconn08, u2cconn09, u2cconn10, u2cconn11).
* Hyperactivity scale from the last 9 items.
COMPUTE u2cconnhypt = 9 * MEAN.5(u2cconn12, u2cconn13, u2cconn14, u2cconn15, 
 u2cconn16, u2cconn17, u2cconn18, u2cconn19, u2cconn20).
* Total scale from all 20 items.
COMPUTE u2cconnt = 20 * MEAN.10(u2cconn01, u2cconn02, u2cconn03, u2cconn04, 
 u2cconn05, u2cconn06, u2cconn07, u2cconn08, u2cconn09, u2cconn10, u2cconn11, 
 u2cconn12, u2cconn13, u2cconn14, u2cconn15, u2cconn16, u2cconn17, u2cconn18, 
 u2cconn19, u2cconn20).
EXECUTE.

u2ccrimt1/2

Criminality total scale, from the twin phase 2 questionnaire, derived from items 1 to 4 of the measure.
This is derived as an integer-valued ordinal scale, where a positive response in each item contributes 1 to the total value - hence the scale has values 0 to 4. Note that the measure includes branching, such that items 3 and 4 were only attempted by twins who gave a positive response in item 2.

* Criminality.
* Derive a simple integer 0-4 scale based on items 1-4.
* First sum the first two, compulsory items, requiring both to be present.
* both are coded 1Y 0N so this gives a total 0-2.
COMPUTE u2ccrimt = SUM.2(u2ccrim1, u2ccrim2).
EXECUTE.
* Items 3 and 4 branch from item 2, so are missing if item 2 was 'no'.
* Add an extra 1 to the scale if the response to item 3 was > 1.
* (arrested more than once).
IF (u2ccrimt > 0 & u2ccrim3 > 1) u2ccrimt = u2ccrimt + 1.
EXECUTE.
* Similarly, add 1 to the scale if the response to item 4 was > 0.
* (spent at least one night in a police cell).
IF (u2ccrimt > 0 & u2ccrim4 > 0) u2ccrimt = u2ccrimt + 1.
EXECUTE.
* This derivation allows for missing data in items 3/4 but not in items 1/2.
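
The branching logic above can be restated as a short Python sketch. This is an illustration only, not the syntax used to build the dataset; the function name criminality_total and the use of None for a missing response are assumptions. As in the SPSS IF statements, a missing branched item simply adds nothing to the scale.

```python
def criminality_total(item1, item2, item3=None, item4=None):
    """Sketch of the 0-4 criminality scale; None stands for a missing response."""
    # Items 1 and 2 are compulsory: both must be present (SUM.2 in the syntax).
    if item1 is None or item2 is None:
        return None
    total = item1 + item2
    # Items 3 and 4 only add to the scale when the total is already positive
    # and the branched item is non-missing and above its threshold.
    if total > 0 and item3 is not None and item3 > 1:
        total += 1
    if total > 0 and item4 is not None and item4 > 0:
        total += 1
    return total


print(criminality_total(0, 0))               # no positive responses
print(criminality_total(0, 1, None, 1))      # item 3 missing, item 4 positive
print(criminality_total(1, 1, 2, 1))         # all four items contribute
```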

u2cdevmob1/2, u2cdevwdth1/2

Device categories used for the TEDS21 twin phase 2 questionnaire, if completed electronically.
u2cdevmob flags mobile devices (mobile phones and tablets), coded 1=yes 0=no.
u2cdevwdth is a device width category, coded 1=small, 2=medium, 3=large, where width refers to the widest side of the device's screen.
Derived from raw item variables, collected on the web/app server, that are not retained in the dataset. Mobile devices could be categorised from both app and web data, using different methods. Screen sizes were only recorded in the web data, not the app data.

* Mobile devices: CMS data.
* ------------------------.
* The raw CMS variable PlatformType has values 'Web' or 'Mobile'.
* so we can assume 'Mobile' value refers to mobile devices.
* while 'Web' in most cases probably means a web browser used on a laptop/desktop.
RECODE PlatformType ('Web'=0) ('Mobile'=1)
INTO u2cdevmob.

* Mobile devices: web backup.
* --------------------------.
* Use substrings of the u2cconsenttechuseragent string to categorise broad device types.
* 'Windows', 'Macintosh' and 'X11' are probably all laptops or desktops.
IF (CHAR.INDEX(u2cconsenttechuseragent, 'Windows') > 0) devicetype = 1.
IF (CHAR.INDEX(u2cconsenttechuseragent, 'Macintosh') > 0) devicetype = 2.
IF (CHAR.INDEX(u2cconsenttechuseragent, 'X11') > 0) devicetype = 3.
EXECUTE.
* 'iPhone' and 'Android' are likely to be mobile phones.
* (in rare cases of Windows phones with Android installed, this supersedes 'Windows' above).
IF (CHAR.INDEX(u2cconsenttechuseragent, 'iPhone') > 0) devicetype = 4.
IF (CHAR.INDEX(u2cconsenttechuseragent, 'Android') > 0) devicetype = 5.
EXECUTE.
* 'iPad' and 'Tablet' are presumably tablets (the latter can overrule 'Windows').
IF (CHAR.INDEX(u2cconsenttechuseragent, 'iPad') > 0) devicetype = 6.
IF (CHAR.INDEX(u2cconsenttechuseragent, 'Tablet') > 0) devicetype = 7.
EXECUTE.
* 'Mobile' is another common substring, but always indicates mobile phones.
* categorised by other substrings above.

* Recode into a binary variable to flag mobile devices.
* to be retained in the dataset (raw device variables will be dropped).
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO u2cdevmob.
EXECUTE.

* Screen width: web backup only.
* ----------------------------.
* Some devices may be used in portrait or landscape mode.
* so for consistency treat the largest dimension as the screen 'width'.
IF (u2cconsenttechscrwidth >= u2cconsenttechscrheight) screenwidth = u2cconsenttechscrwidth.
IF (u2cconsenttechscrwidth < u2cconsenttechscrheight) screenwidth = u2cconsenttechscrheight.
EXECUTE.
* Categorise screen widths arbitrarily as small, medium or large.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO u2cdevwdth.
EXECUTE.
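
The substring matching and width banding can be sketched in Python as follows. This is a hedged illustration rather than the TEDS implementation: the function names are assumptions, the user agent strings in the examples are invented, and screen dimensions are assumed to be whole numbers of pixels. As in the sequence of IF statements, a later match overwrites an earlier one, so 'Tablet' takes precedence over 'Windows' or 'Android'.

```python
# Substrings in the order they are tested; codes 1-3 are laptops/desktops,
# codes 4-7 are mobile phones and tablets.
TOKENS = ["Windows", "Macintosh", "X11", "iPhone", "Android", "iPad", "Tablet"]


def device_type(user_agent):
    """Return the device code 1-7, or None if no substring matches."""
    code = None
    for number, token in enumerate(TOKENS, start=1):
        if token in user_agent:  # case-sensitive, like CHAR.INDEX
            code = number        # later matches overwrite earlier ones
    return code


def is_mobile(code):
    """Recode device codes 1-3 to 0 and 4-7 to 1; missing stays missing."""
    if code is None:
        return None
    return 1 if code >= 4 else 0


def width_category(width, height):
    """Band the longest screen dimension as 1=small, 2=medium, 3=large."""
    if width is None or height is None:
        return None
    longest = max(width, height)
    if longest <= 767:
        return 1
    if longest <= 1199:
        return 2
    return 3


# Invented example strings, for illustration only.
print(is_mobile(device_type("Mozilla/5.0 (Windows NT 10.0; Win64; x64)")))
print(is_mobile(device_type("Mozilla/5.0 (Linux; Android 9; Tablet)")))
print(width_category(375, 812))
```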

u2cganxt1/2, ucv1ganxt1/2, ucv2ganxt1/2, ucv3ganxt1/2, ucv4ganxt1/2

General anxiety (GAD-D) total scale, from the TEDS21 twin phase 2 questionnaire (u2cganxt) and from the twin covid phase 1 (ucv1ganxt), phase 2 (ucv2ganxt), phase 3 (ucv3ganxt) and phase 4 (ucv4ganxt) questionnaires, in each case derived from all 10 available items of the measure.
Each item has integer response values 0-4, hence the scale has a range of values from 0 to 40 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u2cganxt = 10 * MEAN.5(u2cganx01, u2cganx02, u2cganx03,
 u2cganx04, u2cganx05, u2cganx06, u2cganx07, u2cganx08, u2cganx09, u2cganx10).
COMPUTE ucv1ganxt = 10 * MEAN.5(ucv1ganx01, ucv1ganx02, ucv1ganx03,
 ucv1ganx04, ucv1ganx05, ucv1ganx06, ucv1ganx07, ucv1ganx08, ucv1ganx09, ucv1ganx10).
COMPUTE ucv2ganxt = 10 * MEAN.5(ucv2ganx01, ucv2ganx02, ucv2ganx03,
 ucv2ganx04, ucv2ganx05, ucv2ganx06, ucv2ganx07, ucv2ganx08, ucv2ganx09, ucv2ganx10).
COMPUTE ucv3ganxt = 10 * MEAN.5(ucv3ganx01, ucv3ganx02, ucv3ganx03,
 ucv3ganx04, ucv3ganx05, ucv3ganx06, ucv3ganx07, ucv3ganx08, ucv3ganx09, ucv3ganx10).
COMPUTE ucv4ganxt = 10 * MEAN.5(ucv4ganx01, ucv4ganx02, ucv4ganx03,
 ucv4ganx04, ucv4ganx05, ucv4ganx06, ucv4ganx07, ucv4ganx08, ucv4ganx09, ucv4ganx10).
EXECUTE.

u2chasst1/2

Hassles total scale, from the twin phase 2 questionnaire, derived from all 7 available items of the measure.
Each item has integer response values 0-4, hence the scale has a range of values from 0 to 28 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u2chasst = 7 * MEAN.4(u2chass1, u2chass2,
 u2chass3, u2chass4, u2chass5, u2chass6, u2chass7).
EXECUTE.

u2cleism1/2

Leisure and hobbies overall mean scale, derived from all 5 items of the measure in the phase 2 twin questionnaire.
At least half the component items are required to be non-missing. Each item has response values 1-5, hence the scale has the same range, because it is computed as a mean.

COMPUTE u2cleism = MEAN.3(u2cleis1, u2cleis2, u2cleis3, u2cleis4, u2cleis5).
EXECUTE.

u2clfevt1/2, u2clfevnnt1/2, u2clfevnat1/2

Life Events scales, from the twin phase 2 questionnaire, derived from all 11 available items of the measure.
u2clfevt1/2 is a conventional scale representing a total score; each item has integer response values 0-4, hence this scale has a range of values from 0 to 44 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.
u2clfevnnt1/2 and u2clfevnat1/2 are counts of reported life events: the former is a count of events reported with little or no effect (response values 1, 2); the latter is a count of events reported with some effect (response values 3, 4). Each of these two scales may have integer values between 0 and 11, because the measure includes 11 event items. Note that all the events in this measure at this age are treated as negative events.

* Total scale from all 11 items, as the mean multiplied by the number of items.
COMPUTE u2clfevt = 11 * MEAN.6(u2clfev01, u2clfev02, u2clfev03, u2clfev04,
 u2clfev05, u2clfev06, u2clfev07, u2clfev08, u2clfev09, u2clfev10, u2clfev11).
EXECUTE.

* Correlations between items are low because these are mostly independent events.
* So as an alternative to the scale add counts of events, as at age 26.
* First count the number of all responses in the measure, including 'no' responses.
COUNT u2clfevcount = u2clfev01 u2clfev02 u2clfev03 u2clfev04 u2clfev05 u2clfev06 
 u2clfev07 u2clfev08 u2clfev09 u2clfev10 u2clfev11 (0 THRU HIGHEST).
EXECUTE.
* Counts of events may be invalid if too many are missing.
* so require at least 9 of the 11 items to be answered (this excludes only 2 twins).
DO IF (u2clfevcount >= 9).
 * Count events that occurred but had little or no effect on the twin.
 COUNT u2clfevnnt = u2clfev01 u2clfev02 u2clfev03 u2clfev04 u2clfev05 u2clfev06 
  u2clfev07 u2clfev08 u2clfev09 u2clfev10 u2clfev11 (1, 2).
 * Count events that affected the twin (moderately or a lot).
 COUNT u2clfevnat = u2clfev01 u2clfev02 u2clfev03 u2clfev04 u2clfev05 u2clfev06 
  u2clfev07 u2clfev08 u2clfev09 u2clfev10 u2clfev11 (3, 4).
END IF.
EXECUTE.
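
The two event counts can be sketched in Python as below. This is an illustration under stated assumptions, not the TEDS syntax: the function name life_event_counts is invented, and None stands for an unanswered item. The sketch follows the same steps as the syntax: count the answered items, require at least 9 of the 11, then count responses of 1 or 2 and responses of 3 or 4 separately.

```python
def life_event_counts(items, min_answered=9):
    """Return (little_or_no_effect, some_effect) counts, or (None, None)."""
    answered = [x for x in items if x is not None]
    # Counts may be misleading if too many items are unanswered.
    if len(answered) < min_answered:
        return None, None
    little_or_no_effect = sum(1 for x in answered if x in (1, 2))
    some_effect = sum(1 for x in answered if x in (3, 4))
    return little_or_no_effect, some_effect


# Eleven items: 0 means the event did not occur, None means no answer.
responses = [0, 1, 2, 0, 3, 4, 0, 0, None, 0, 1]
print(life_event_counts(responses))
```

In this example ten items are answered, three events are reported with little or no effect and two with some effect.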

u2cLLCage1/2, u2cLLCdate1/2

See u1cLLCage1/2, etc above.

u2clrssparm1/2, u2clrssoccm1/2, u2clrsshomm1/2

Life Role Salience (LRSS) subscales, derived from items of the measure in the phase 2 twin questionnaire.
The subscales are for parental roles (u2clrssparm), occupational roles (u2clrssoccm) and home care roles (u2clrsshomm).
Each subscale is derived from 4 items (with item 6 reversed). For each scale, at least half the component items are required to be non-missing. Each item has response values 1-5, hence each of these scales has the same range, because it is computed as a mean.

* Parental role: items 1,4,7,10.
COMPUTE u2clrssparm = MEAN.2(u2clrss01, u2clrss04, u2clrss07, u2clrss10).
* Occupational role: items 2,5,8,11.
COMPUTE u2clrssoccm = MEAN.2(u2clrss02, u2clrss05, u2clrss08, u2clrss11).
* Home care role: items 3,6,9,12 (with item 6 reversed).
COMPUTE u2clrsshomm = MEAN.2(u2clrss03, (6 - u2clrss06), u2clrss09, u2clrss12).
EXECUTE.
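
The reversal of item 6 can be shown with a small Python sketch. It is illustrative only; the function names and the use of None for a missing response are assumptions. On a 1-5 response scale, subtracting the response from 6 maps 1 to 5, 2 to 4, and so on, and a missing response stays missing.

```python
def reverse_1_to_5(value):
    """Reverse a 1-5 response (1<->5, 2<->4); a missing value stays missing."""
    return None if value is None else 6 - value


def mean_if_enough(values, min_valid):
    """Mean of the non-missing values, or None if fewer than min_valid."""
    valid = [x for x in values if x is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


def home_care_role(item03, item06, item09, item12):
    """Home care role subscale: items 3, 6 (reversed), 9 and 12."""
    return mean_if_enough(
        [item03, reverse_1_to_5(item06), item09, item12], min_valid=2
    )


print(home_care_role(4, 2, 4, 4))        # item 6 response 2 is reversed to 4
print(home_care_role(None, 5, None, 3))  # two valid items: reversed 1, and 3
```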

u2cmfqt1/2

See u1cmfqt1/2, u2cmfqt1/2 above.

u2cperpphyt1/2, u2cperpsoct1/2, u2cperpvert1/2, u2cperpcybt1/2, u2cperpt1/2, u2cvictphyt1/2, u2cvictsoct1/2, u2cvictvert1/2, u2cvictcybt1/2, u2cvictt1/2, ucv1victphyt1/2, ucv1victvert1/2, ucv1victcybt1/2, ucv1victt1/2, ucv2victphyt1/2, ucv2victvert1/2, ucv2victcybt1/2, ucv2victt1/2, ucv3victphyt1/2, ucv3victvert1/2, ucv3victcybt1/2, ucv3victt1/2, ucv4victphyt1/2, ucv4victvert1/2, ucv4victcybt1/2, ucv4victt1/2

Victimisation and Perpetration subscales, derived from items of the two closely-related measures in the TEDS21 phase 2 twin questionnaire, and the shortened victimisation scale in the covid questionnaires.
The TEDS21 victimisation and perpetration measures have essentially the same 16 items, rephrased for victimisation by peers of the twin (u2cvict) then for victimisation towards others perpetrated by the twin (u2cperp). The same set of scales and subscales is therefore derived for the two measures. The covid questionnaires (ucv1vict, ucv2vict, ucv3vict, ucv4vict) had a subset of 12 of the 16 victimisation items, omitting the four social items and omitting the perpetration items.
The subscales are for physical (phyt), social (soct), verbal (vert) and cyber-media (cybt) victimisation, each derived from 4 of the items. There is also an overall total scale (u2cperpt, u2cvictt) derived from all 16 items, or in the case of the covid questionnaire (ucv1victt, ucv2victt, ucv3victt, ucv4victt) derived from all available 12 items.
For each scale, at least half the component items are required to be non-missing. Each item has response values 0-2, hence each scale has a range of values from 0 to (2 * number of items) because it is computed as the mean multiplied by the number of component items.

* Total scale and four subscales.
* Physical victimisation subscale: items 1/5/9/13 in TEDS21, items 1/4/7/10 in covid.
COMPUTE u2cvictphyt = 4 * MEAN.2(u2cvict01, u2cvict05, u2cvict09, u2cvict13).
COMPUTE u2cperpphyt = 4 * MEAN.2(u2cperp01, u2cperp05, u2cperp09, u2cperp13).
COMPUTE ucv1victphyt = 4 * MEAN.2(ucv1vict01, ucv1vict04, ucv1vict07, ucv1vict10).
COMPUTE ucv2victphyt = 4 * MEAN.2(ucv2vict01, ucv2vict04, ucv2vict07, ucv2vict10).
COMPUTE ucv3victphyt = 4 * MEAN.2(ucv3vict01, ucv3vict04, ucv3vict07, ucv3vict10).
COMPUTE ucv4victphyt = 4 * MEAN.2(ucv4vict01, ucv4vict04, ucv4vict07, ucv4vict10).
EXECUTE.
* Social victimisation subscale: items 2/6/10/14 in TEDS21.
COMPUTE u2cvictsoct = 4 * MEAN.2(u2cvict02, u2cvict06, u2cvict10, u2cvict14).
COMPUTE u2cperpsoct = 4 * MEAN.2(u2cperp02, u2cperp06, u2cperp10, u2cperp14).
EXECUTE.
* Verbal victimisation subscale: items 3/7/11/15 in TEDS21, items 2/5/8/11 in covid.
COMPUTE u2cvictvert = 4 * MEAN.2(u2cvict03, u2cvict07, u2cvict11, u2cvict15).
COMPUTE u2cperpvert = 4 * MEAN.2(u2cperp03, u2cperp07, u2cperp11, u2cperp15).
COMPUTE ucv1victvert = 4 * MEAN.2(ucv1vict02, ucv1vict05, ucv1vict08, ucv1vict11).
COMPUTE ucv2victvert = 4 * MEAN.2(ucv2vict02, ucv2vict05, ucv2vict08, ucv2vict11).
COMPUTE ucv3victvert = 4 * MEAN.2(ucv3vict02, ucv3vict05, ucv3vict08, ucv3vict11).
COMPUTE ucv4victvert = 4 * MEAN.2(ucv4vict02, ucv4vict05, ucv4vict08, ucv4vict11).
EXECUTE.
* Cyber-victimisation subscale: items 4/8/12/16 in TEDS21, items 3/6/9/12 in covid.
COMPUTE u2cvictcybt = 4 * MEAN.2(u2cvict04, u2cvict08, u2cvict12, u2cvict16).
COMPUTE u2cperpcybt = 4 * MEAN.2(u2cperp04, u2cperp08, u2cperp12, u2cperp16).
COMPUTE ucv1victcybt = 4 * MEAN.2(ucv1vict03, ucv1vict06, ucv1vict09, ucv1vict12).
COMPUTE ucv2victcybt = 4 * MEAN.2(ucv2vict03, ucv2vict06, ucv2vict09, ucv2vict12).
COMPUTE ucv3victcybt = 4 * MEAN.2(ucv3vict03, ucv3vict06, ucv3vict09, ucv3vict12).
COMPUTE ucv4victcybt = 4 * MEAN.2(ucv4vict03, ucv4vict06, ucv4vict09, ucv4vict12).
EXECUTE.
* Overall total from all 16 items (TEDS21).
* or 12 items (Covid).
COMPUTE u2cvictt = 16 * MEAN.8(u2cvict01, u2cvict02, u2cvict03, u2cvict04,
 u2cvict05, u2cvict06, u2cvict07, u2cvict08, u2cvict09, u2cvict10,
 u2cvict11, u2cvict12, u2cvict13, u2cvict14, u2cvict15, u2cvict16).
COMPUTE u2cperpt = 16 * MEAN.8(u2cperp01, u2cperp02, u2cperp03, u2cperp04,
 u2cperp05, u2cperp06, u2cperp07, u2cperp08, u2cperp09, u2cperp10,
 u2cperp11, u2cperp12, u2cperp13, u2cperp14, u2cperp15, u2cperp16).
COMPUTE ucv1victt = 12 * MEAN.6(ucv1vict01, ucv1vict02, ucv1vict03, ucv1vict04,
 ucv1vict05, ucv1vict06, ucv1vict07, ucv1vict08, ucv1vict09, ucv1vict10,
 ucv1vict11, ucv1vict12).
COMPUTE ucv2victt = 12 * MEAN.6(ucv2vict01, ucv2vict02, ucv2vict03, ucv2vict04,
 ucv2vict05, ucv2vict06, ucv2vict07, ucv2vict08, ucv2vict09, ucv2vict10,
 ucv2vict11, ucv2vict12).
COMPUTE ucv3victt = 12 * MEAN.6(ucv3vict01, ucv3vict02, ucv3vict03, ucv3vict04,
 ucv3vict05, ucv3vict06, ucv3vict07, ucv3vict08, ucv3vict09, ucv3vict10,
 ucv3vict11, ucv3vict12).
COMPUTE ucv4victt = 12 * MEAN.6(ucv4vict01, ucv4vict02, ucv4vict03, ucv4vict04,
 ucv4vict05, ucv4vict06, ucv4vict07, ucv4vict08, ucv4vict09, ucv4vict10,
 ucv4vict11, ucv4vict12).
EXECUTE.
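
The item numbering in the syntax above follows a regular pattern: the subscales are interleaved, so each subscale takes every 4th item in the 16-item TEDS21 measures (four subscales) and every 3rd item in the 12-item covid measure (three subscales). The Python sketch below only illustrates that pattern; the function name subscale_items is an assumption and does not appear in the TEDS syntax.

```python
def subscale_items(prefix, first_item, n_subscales, items_per_subscale=4):
    """List the item variable names for one interleaved subscale.

    first_item is the position of the subscale's first item (1-based);
    consecutive items of the same subscale are n_subscales apart.
    """
    return [
        f"{prefix}{first_item + k * n_subscales:02d}"
        for k in range(items_per_subscale)
    ]


# Physical subscale in TEDS21 (four subscales): items 1/5/9/13.
print(subscale_items("u2cvict", 1, 4))
# Cyber subscale in covid phase 1 (three subscales): items 3/6/9/12.
print(subscale_items("ucv1vict", 3, 3))
```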

u2cslpqt1/2

Sleep quality total scale, from the twin phase 2 questionnaire, derived from all 8 available items of the measure.
Each item has integer response values 0-3, hence the scale has a range of values from 0 to 24 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u2cslpqt = 8 * MEAN.4(u2cslpq1, u2cslpq2, u2cslpq3,
 u2cslpq4, u2cslpq5, u2cslpq6, u2cslpq7, u2cslpq8).
EXECUTE.

u2cspeqpart1/2, u2cspeqhalt1/2

Specific Psychotic Experiences Questionnaire (SPEQ) subscales, derived from items of the measure in the phase 2 twin questionnaire.
The subscales are for paranoia (u2cspeqpart), and hallucinations (u2cspeqhalt).
The subscales are derived from 15 and 9 items respectively. For each scale, at least half the component items are required to be non-missing. Each item has response values 0-5, hence each scale has a range of values from 0 to (5 * number of items) because it is computed as the mean multiplied by the number of component items.

* Paranoia subscale: items 1-15.
COMPUTE u2cspeqpart = 15 * MEAN.8(u2cspeq01, u2cspeq02, u2cspeq03, 
 u2cspeq04, u2cspeq05, u2cspeq06, u2cspeq07, u2cspeq08, u2cspeq09, 
 u2cspeq10, u2cspeq11, u2cspeq12, u2cspeq13, u2cspeq14, u2cspeq15).
* Hallucinations subscale: items 16-24.
COMPUTE u2cspeqhalt = 9 * MEAN.5(u2cspeq16, u2cspeq17, u2cspeq18, 
 u2cspeq19, u2cspeq20, u2cspeq21, u2cspeq22, u2cspeq23, u2cspeq24).
EXECUTE.

u2cvictphyt1/2, u2cvictsoct1/2, u2cvictvert1/2, u2cvictcybt1/2, u2cvictt1/2

See u2cperpphyt1/2, etc above.

ucgage1/2

Age of twin (in decimal years) when the g-game was started.
Derived from ucgconstdt (g-game consent start date) and aonsdob (twin birth date). These date variables are not retained in the dataset.

COMPUTE ucgage = RND((DATEDIFF(ucgconstdt, aonsdob, "days")) / 365.25, 0.1) .
EXECUTE.
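
The age calculation can be reproduced approximately in Python, as sketched below. This is an illustration, not the TEDS syntax: the function name is an assumption and the two dates are invented. The number of whole days between the dates is divided by 365.25 and rounded to one decimal place; at exact rounding ties, Python's round may behave differently from SPSS's RND.

```python
from datetime import date


def age_in_decimal_years(start_date, birth_date):
    """Age at start_date in years, rounded to one decimal place."""
    days = (start_date - birth_date).days
    return round(days / 365.25, 1)


# Invented dates: 8192 days apart, which is about 22.43 years.
print(age_in_decimal_years(date(2017, 6, 20), date(1995, 1, 15)))
```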

ucgdevmob1/2, ucgdevwdth1/2

Device categories used for the g-game web tests.
ucgdevmob flags mobile devices (mobile phones and tablets), coded 1=yes 0=no.
ucgdevwdth is a device width category, coded 1=small, 2=medium, 3=large, where width refers to the widest side of the device's screen.
Derived from raw item variables, collected on the web server, that are not retained in the dataset: consent__techuseragent (a complex string denoting the type of user agent), consent__techscrwidth and consent__techscrheight (screen dimensions in pixels).

* Use the consent__techuseragent field to identify crude device types from substrings.
* 'Windows', 'Macintosh' and 'X11' are probably all laptops or desktops.
IF (CHAR.INDEX(consent__techuseragent, 'Windows') > 0) devicetype = 1.
IF (CHAR.INDEX(consent__techuseragent, 'Macintosh') > 0) devicetype = 2.
IF (CHAR.INDEX(consent__techuseragent, 'X11') > 0) devicetype = 3.
EXECUTE.
* 'iPhone' and 'Android' are likely to be mobile phones.
IF (CHAR.INDEX(consent__techuseragent, 'iPhone') > 0) devicetype = 4.
IF (CHAR.INDEX(consent__techuseragent, 'Android') > 0) devicetype = 5.
EXECUTE.
* 'iPad' and 'Tablet' are presumably tablets (the latter can overrule 'Windows' and 'Android' above).
IF (CHAR.INDEX(consent__techuseragent, 'iPad') > 0) devicetype = 6.
IF (CHAR.INDEX(consent__techuseragent, 'Tablet') > 0) devicetype = 7.
EXECUTE.
* Recode into a binary variable to flag mobile devices.
* to be retained in the dataset (raw device variables will be dropped).
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO ucgdevmob.
EXECUTE.

* Some devices may be used in portrait or landscape mode.
* so for consistency treat the largest dimension as the screen 'width'.
IF (consent__techscrwidth >= consent__techscrheight) screenwidth = consent__techscrwidth.
IF (consent__techscrwidth < consent__techscrheight) screenwidth = consent__techscrheight.
EXECUTE.
* Categorise screen widths arbitrarily as small, medium or large.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO ucgdevwdth.
EXECUTE.

ucgLLCage1/2, ucgLLCdate1/2

See u1cLLCage1/2, etc above.

ucgnvt1/2, ucgvbt1/2, ucgt1/2

G-game total scores for verbal ability (ucgvbt), non-verbal ability (ucgnvt) and overall general cognitive ability or 'g' (ucgt).
Each is derived as the sum of the relevant sub-test scores, if all completed.

* The g-game is designed to have equal weighting of verbal and non-verbal items/scores (20 each).
* Therefore create simple sums as scores for verbal, non-verbal and g.
* requiring all relevant sub-tests to be non-missing.
COMPUTE ucgnvt = SUM.2(ucgisttot, ucgravtot).
COMPUTE ucgvbt = SUM.3(ucgvoctot, ucgmistot, ucgvertot).
COMPUTE ucgt = SUM.2(ucgnvt, ucgvbt).
EXECUTE.
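
Unlike the MEAN.n scales elsewhere on this page, these totals use SUM.n with n equal to the number of arguments, so a single missing sub-test makes the total missing. The Python sketch below illustrates this; the function name sum_if_all_present, the use of None for a missing score and the example scores are all assumptions made for illustration.

```python
def sum_if_all_present(scores):
    """Sum of the scores, or None if any score is missing."""
    if any(s is None for s in scores):
        return None
    return sum(scores)


def g_scores(ist, rav, voc, mis, ver):
    """Return (non-verbal, verbal, g) totals from the five sub-test scores."""
    nonverbal = sum_if_all_present([ist, rav])
    verbal = sum_if_all_present([voc, mis, ver])
    # g is missing whenever either composite is missing.
    g = sum_if_all_present([nonverbal, verbal])
    return nonverbal, verbal, g


print(g_scores(8, 7, 5, 6, 4))     # all sub-tests completed
print(g_scores(8, 7, 5, None, 4))  # one verbal sub-test missing
```

In the second example the non-verbal total is still computed, but the verbal total and g are missing.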

ucv1actvm1/2

See u1cactvm1/2, etc above.

ucv1age1/2, ucv2age1/2, ucv3age1/2, ucv4age1/2

Age of twin (in decimal years) at the start of the respective covid questionnaires in phase 1 (ucv1age), phase 2 (ucv2age), phase 3 (ucv3age) and phase 4 (ucv4age).
Derived from ucv1constdt/ucv2constdt/ucv3constdt/ucv4constdt (consent start date) and aonsdob (twin birth date). These date variables are not retained in the dataset.

COMPUTE ucv1age = RND((DATEDIFF(ucv1constdt, aonsdob, "days")) / 365.25, 0.1) .
COMPUTE ucv2age = RND((DATEDIFF(ucv2constdt, aonsdob, "days")) / 365.25, 0.1) .
COMPUTE ucv3age = RND((DATEDIFF(ucv3constdt, aonsdob, "days")) / 365.25, 0.1) .
COMPUTE ucv4age = RND((DATEDIFF(ucv4constdt, aonsdob, "days")) / 365.25, 0.1) .
EXECUTE.

ucv1alco2un1/2

See u2calcoaudit1/2, etc above.

ucv1commm1/2

See u1ccommm1/2, etc above.

ucv1devmob1/2, ucv1devwdth1/2, ucv2devmob1/2, ucv2devwdth1/2, ucv3devmob1/2, ucv3devwdth1/2, ucv4devmob1/2, ucv4devwdth1/2

Device categories used for the web covid phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) questionnaires.
ucvXdevmob flags mobile devices (mobile phones and tablets), coded 1=yes 0=no.
ucvXdevwdth is a device width category, coded 1=small, 2=medium, 3=large, where width refers to the widest side of the device's screen.
Derived from raw item variables, collected on the web server, that are not retained in the dataset: consent__techuseragent (a complex string denoting the type of user agent), consent__techscrwidth and consent__techscrheight (screen dimensions in pixels).

* Phase 1.
* -------.
* Use the consent__techuseragent field to identify crude device types from substrings.
* 'Windows', 'Macintosh' and 'X11' are probably all laptops or desktops.
IF (CHAR.INDEX(consent__techuseragent, 'Windows') > 0) devicetype = 1.
IF (CHAR.INDEX(consent__techuseragent, 'Macintosh') > 0) devicetype = 2.
IF (CHAR.INDEX(consent__techuseragent, 'X11') > 0) devicetype = 3.
EXECUTE.
* 'iPhone' and 'Android' are likely to be mobile phones.
IF (CHAR.INDEX(consent__techuseragent, 'iPhone') > 0) devicetype = 4.
IF (CHAR.INDEX(consent__techuseragent, 'Android') > 0) devicetype = 5.
EXECUTE.
* 'iPad' and 'Tablet' are presumably tablets (the latter can overrule 'Windows' and 'Android' above).
IF (CHAR.INDEX(consent__techuseragent, 'iPad') > 0) devicetype = 6.
IF (CHAR.INDEX(consent__techuseragent, 'Tablet') > 0) devicetype = 7.
EXECUTE.
* Recode into a binary variable to flag mobile devices.
* to be retained in the dataset (raw device variables will be dropped).
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO ucv1devmob.
EXECUTE.

* Some devices may be used in portrait or landscape mode.
* so for consistency treat the largest dimension as the screen 'width'.
IF (consent__techscrwidth >= consent__techscrheight) screenwidth = consent__techscrwidth.
IF (consent__techscrwidth < consent__techscrheight) screenwidth = consent__techscrheight.
EXECUTE.
* Categorise screen widths arbitrarily as small, medium or large.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO ucv1devwdth.
EXECUTE.

* Phase 2.
* -------.
* Repeat the syntax above to derive devicetype and screenwidth in exactly the same way.
* from the phase 2 raw data file - then derive the dataset variables as follows.
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO ucv2devmob.
EXECUTE.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO ucv2devwdth.
EXECUTE.

* Phase 3.
* -------.
* Repeat the syntax above to derive devicetype and screenwidth in exactly the same way.
* from the phase 3 raw data file - then derive the dataset variables as follows.
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO ucv3devmob.
EXECUTE.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO ucv3devwdth.
EXECUTE.

* Phase 4.
* -------.
* Repeat the syntax above to derive devicetype and screenwidth in exactly the same way.
* from the phase 4 raw data file - then derive the dataset variables as follows.
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO ucv4devmob.
EXECUTE.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO ucv4devwdth.
EXECUTE.