UScript
class UScript
kotlin.Any
   ↳ android.icu.lang.UScript
Constants for ISO 15924 script codes, and related functions.
The current set of script code constants supports at least all scripts that are encoded in the version of Unicode which ICU currently supports. The names of the constants are usually derived from the Unicode script property value aliases. See UAX #24 Unicode Script Property (http://www.unicode.org/reports/tr24/) and http://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt .
In addition, constants for many ISO 15924 script codes are included, for use with language tags, CLDR data, and similar. Some of those codes are not used in the Unicode Character Database (UCD). For example, there are no characters that have a UCD script property value of Hans or Hant. All Han ideographs have the Hani script property value in Unicode.
Private-use codes Qaaa..Qabx are not included, except as used in the UCD or in CLDR.
Starting with ICU 55, script codes are only added when their scripts have been or will certainly be encoded in Unicode, and have been assigned Unicode script property value aliases, to ensure that their script names are stable and match the names of the constants. Script codes like Latf and Aran that are not subject to separate encoding may be added at any time.
Summary
Nested classes | |
---|---|
enum |
UScript.ScriptUsage Script usage constants. |
Constants | |
---|---|
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Arabic |
static Int |
Armenian |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Bengali |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Bopomofo |
static Int |
ISO 15924 script code |
static Int |
Braille Script in Unicode 4 |
static Int |
Script in Unicode 4. |
static Int |
Buhid |
static Int |
Unified Canadian Aboriginal Syllabics |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Cherokee |
static Int | |
static Int |
ISO 15924 script code |
static Int |
Common |
static Int |
Coptic |
static Int |
ISO 15924 script code |
static Int |
Cypriot Script in Unicode 4 |
static Int | |
static Int |
Cyrillic |
static Int |
ISO 15924 script code |
static Int |
Deseret |
static Int |
Devanagari |
static Int | |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int | |
static Int |
ISO 15924 script code |
static Int |
Ethiopic |
static Int |
Georgian |
static Int |
Script in Unicode 4. |
static Int |
Gothic |
static Int |
ISO 15924 script code |
static Int |
Greek |
static Int |
Gujarati |
static Int | |
static Int |
Gurmukhi |
static Int |
Han |
static Int |
Hangul |
static Int | |
static Int |
Hanunoo |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Hebrew |
static Int |
ISO 15924 script code |
static Int |
Hiragana |
static Int |
ISO 15924 script code |
static Int |
Inherited |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Invalid code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Kannada |
static Int |
Katakana |
static Int |
Script in Unicode 4. |
static Int | |
static Int |
ISO 15924 script code |
static Int |
Script in Unicode 4. |
static Int | |
static Int |
Khmer |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Lao |
static Int |
Latin |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Limbu Script in Unicode 4 |
static Int |
ISO 15924 script code |
static Int |
Linear B Script in Unicode 4 |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int | |
static Int |
Malayalam |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int | |
static Int |
ISO 15924 script code |
static Int |
Mende Kikakui ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Mongolian |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Myanmar |
static Int |
ISO 15924 script code |
static Int | |
static Int |
ISO 15924 script code |
static Int | |
static Int |
ISO 15924 script code |
static Int |
Script in Unicode 4. |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int | |
static Int |
Ogham |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Old Italic |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Script in Unicode 4. |
static Int | |
static Int |
ISO 15924 script code |
static Int | |
static Int |
ISO 15924 script code |
static Int |
Oriya |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Osmanya Script in Unicode 4 |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Runic |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Shavian Script in Unicode 4 |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code for Sutton SignWriting |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Sinhala |
static Int | |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Script in Unicode 4. |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Syriac |
static Int |
Tagalog |
static Int |
Tagbanwa |
static Int |
Tai Le Script in Unicode 4 |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
Tamil |
static Int | |
static Int |
ISO 15924 script code |
static Int |
Telugu |
static Int |
ISO 15924 script code |
static Int |
Thaana |
static Int |
Thai |
static Int |
Tibetan |
static Int |
Script in Unicode 4. |
static Int |
ISO 15924 script code |
static Int | |
static Int |
ISO 15924 script code |
static Int |
Unified Canadian Aboriginal Syllabics (alias) |
static Int |
Ugaritic Script in Unicode 4 |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int | |
static Int | |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int |
ISO 15924 script code |
static Int | |
static Int |
Yi syllables |
static Int |
ISO 15924 script code |
Public methods | |
---|---|
static Boolean |
breaksBetweenLetters(script: Int) Returns true if the script allows line breaks between letters (excluding hyphenation). |
static IntArray! |
getCode(locale: Locale!) Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. |
static IntArray! |
getCode(locale: ULocale!) Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. |
static IntArray! |
getCode(nameOrAbbrOrLocale: String!) Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. |
static Int |
getCodeFromName(nameOrAbbr: String!) Returns the script code associated with the given Unicode script property alias (name or abbreviation). |
static String! |
getName(scriptCode: Int) Returns the long Unicode script name, if there is one. |
static String! |
getSampleString(script: Int) Returns the script sample character string. |
static Int |
getScript(codepoint: Int) Gets the script code associated with the given codepoint. |
static Int |
getScriptExtensions(c: Int, set: BitSet!) Sets code point c's Script_Extensions as script code integers into the output BitSet. |
static String! |
getShortName(scriptCode: Int) Returns the 4-letter ISO 15924 script code, which is the same as the short Unicode script name if Unicode has names for the script. |
static UScript.ScriptUsage! |
getUsage(script: Int) Returns the script usage according to UAX #31 Unicode Identifier and Pattern Syntax. |
static Boolean |
hasScript(c: Int, sc: Int) Returns true if the Script_Extensions of code point c contain script sc. If c does not have explicit Script_Extensions, then this tests whether c has the Script property value sc. |
static Boolean |
isCased(script: Int) Returns true if, in modern (or most recent) usage of the script, case distinctions are customary. |
static Boolean |
isRightToLeft(script: Int) Returns true if the script is written right-to-left. |
Constants
ANATOLIAN_HIEROGLYPHS
static val ANATOLIAN_HIEROGLYPHS: Int
ISO 15924 script code
Value: 156
CANADIAN_ABORIGINAL
static val CANADIAN_ABORIGINAL: Int
Unified Canadian Aboriginal Syllabics
Value: 40
CAUCASIAN_ALBANIAN
static val CAUCASIAN_ALBANIAN: Int
ISO 15924 script code
Value: 159
DEMOTIC_EGYPTIAN
static val DEMOTIC_EGYPTIAN: Int
ISO 15924 script code
Value: 69
EGYPTIAN_HIEROGLYPHS
static val EGYPTIAN_HIEROGLYPHS: Int
ISO 15924 script code
Value: 71
ESTRANGELO_SYRIAC
static val ESTRANGELO_SYRIAC: Int
ISO 15924 script code
Value: 95
HAN_WITH_BOPOMOFO
static val HAN_WITH_BOPOMOFO: Int
ISO 15924 script code
Value: 172
HIERATIC_EGYPTIAN
static val HIERATIC_EGYPTIAN: Int
ISO 15924 script code
Value: 70
IMPERIAL_ARAMAIC
static val IMPERIAL_ARAMAIC: Int
ISO 15924 script code
Value: 116
INSCRIPTIONAL_PAHLAVI
static val INSCRIPTIONAL_PAHLAVI: Int
ISO 15924 script code
Value: 122
INSCRIPTIONAL_PARTHIAN
static val INSCRIPTIONAL_PARTHIAN: Int
ISO 15924 script code
Value: 125
KATAKANA_OR_HIRAGANA
static val KATAKANA_OR_HIRAGANA: Int
Script in Unicode 4.0.1
Value: 54
MATHEMATICAL_NOTATION
static val MATHEMATICAL_NOTATION: Int
ISO 15924 script code
Value: 128
MAYAN_HIEROGLYPHS
static val MAYAN_HIEROGLYPHS: Int
ISO 15924 script code
Value: 85
MEROITIC_CURSIVE
static val MEROITIC_CURSIVE: Int
ISO 15924 script code
Value: 141
MEROITIC_HIEROGLYPHS
static val MEROITIC_HIEROGLYPHS: Int
ISO 15924 script code
Value: 86
OLD_CHURCH_SLAVONIC_CYRILLIC
static val OLD_CHURCH_SLAVONIC_CYRILLIC: Int
ISO 15924 script code
Value: 68
OLD_NORTH_ARABIAN
static val OLD_NORTH_ARABIAN: Int
ISO 15924 script code
Value: 142
OLD_SOUTH_ARABIAN
static val OLD_SOUTH_ARABIAN: Int
ISO 15924 script code
Value: 133
PHONETIC_POLLARD
static val PHONETIC_POLLARD: Int
ISO 15924 script code
Value: 92
PSALTER_PAHLAVI
static val PSALTER_PAHLAVI: Int
ISO 15924 script code
Value: 123
SIGN_WRITING
static val SIGN_WRITING: Int
ISO 15924 script code for Sutton SignWriting
Value: 112
TRADITIONAL_HAN
static val TRADITIONAL_HAN: Int
ISO 15924 script code
Value: 74
UCAS
static val UCAS: Int
Unified Canadian Aboriginal Syllabics (alias)
Value: 40
UNWRITTEN_LANGUAGES
static val UNWRITTEN_LANGUAGES: Int
ISO 15924 script code
Value: 102
VISIBLE_SPEECH
static val VISIBLE_SPEECH: Int
ISO 15924 script code
Value: 100
ZANABAZAR_SQUARE
static val ZANABAZAR_SQUARE: Int
ISO 15924 script code
Value: 177
Public methods
breaksBetweenLetters
static fun breaksBetweenLetters(script: Int): Boolean
Returns true if the script allows line breaks between letters (excluding hyphenation). Such a script typically requires dictionary-based line breaking. For example, Hani and Thai.
Parameters | |
---|---|
script |
Int: script code |
Return | |
---|---|
Boolean |
true if the script allows line breaks between letters |
getCode
static fun getCode(locale: Locale!): IntArray!
Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. Returns MALAYALAM given "Malayalam" or "Mlym". Returns LATIN given "en" or "en_US".
Parameters | |
---|---|
locale |
Locale!: Locale |
Return | |
---|---|
IntArray! |
The script codes array, or null if the code cannot be found. |
getCode
static fun getCode(locale: ULocale!): IntArray!
Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. Returns MALAYALAM given "Malayalam" or "Mlym". Returns LATIN given "en" or "en_US".
Parameters | |
---|---|
locale |
ULocale!: ULocale |
Return | |
---|---|
IntArray! |
The script codes array, or null if the code cannot be found. |
getCode
static fun getCode(nameOrAbbrOrLocale: String!): IntArray!
Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. Returns MALAYALAM given "Malayalam" or "Mlym". Returns LATIN given "en" or "en_US".
Note: To search by short or long script alias only, use getCodeFromName(java.lang.String) instead. That does a fast lookup without accessing the locale data.
Parameters | |
---|---|
nameOrAbbrOrLocale |
String!: name of the script or ISO 15924 code or locale |
Return | |
---|---|
IntArray! |
The script codes array, or null if the code cannot be found. |
getCodeFromName
static fun getCodeFromName(nameOrAbbr: String!): Int
Returns the script code associated with the given Unicode script property alias (name or abbreviation). Short aliases are ISO 15924 script codes. Returns MALAYALAM given "Malayalam" or "Mlym".
Parameters | |
---|---|
nameOrAbbr |
String!: name of the script or ISO 15924 code |
Return | |
---|---|
Int |
The script code value, or INVALID_CODE if the code cannot be found. |
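The difference between the alias-only lookup and the locale-aware lookup can be sketched as follows. This is a minimal illustration, assuming it runs on Android where android.icu is available; the helper name `resolveScripts` is hypothetical and not part of the API.

```kotlin
import android.icu.lang.UScript

/**
 * Hypothetical helper: tries the fast alias lookup first and falls back
 * to the locale-aware lookup only when the input is not a script alias.
 */
fun resolveScripts(nameOrAbbrOrLocale: String): IntArray {
    // Fast path: matches script names and ISO 15924 codes only.
    val code = UScript.getCodeFromName(nameOrAbbrOrLocale)
    if (code != UScript.INVALID_CODE) return intArrayOf(code)

    // Slower path: also accepts locale identifiers such as "en" or "en_US".
    // getCode returns null when nothing matches, mapped here to an empty array.
    return UScript.getCode(nameOrAbbrOrLocale) ?: IntArray(0)
}
```

With this sketch, `resolveScripts("Mlym")` yields an array holding `UScript.MALAYALAM`, while `resolveScripts("en")` goes through the locale path and yields an array containing `UScript.LATIN`.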
getName
static fun getName(scriptCode: Int): String!
Returns the long Unicode script name, if there is one. Otherwise returns the 4-letter ISO 15924 script code. Returns "Malayalam" given MALAYALAM.
Parameters | |
---|---|
scriptCode |
Int: int script code |
Return | |
---|---|
String! |
long script name as given in PropertyValueAliases.txt, or the 4-letter code |
Exceptions | |
---|---|
java.lang.IllegalArgumentException |
if the script code is not valid |
getSampleString
static fun getSampleString(script: Int): String!
Returns the script sample character string. This string normally consists of one code point but might be longer. The string is empty if the script is not encoded.
Parameters | |
---|---|
script |
Int: script code |
Return | |
---|---|
String! |
the sample character string |
getScript
static fun getScript(codepoint: Int): Int
Gets the script code associated with the given codepoint. Returns UScript.MALAYALAM given 0x0D02.
Parameters | |
---|---|
codepoint |
Int: UChar32 codepoint |
Return | |
---|---|
Int |
The script code |
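As an illustration of per-code-point lookup, the sketch below counts how many code points of a string fall into each script. The helper name `scriptHistogram` is hypothetical; the loop advances by code point so that supplementary characters are passed to getScript whole.

```kotlin
import android.icu.lang.UScript

/** Hypothetical helper: counts the code points of [text] per script code. */
fun scriptHistogram(text: String): Map<Int, Int> {
    val counts = mutableMapOf<Int, Int>()
    var i = 0
    while (i < text.length) {
        val cp = text.codePointAt(i)          // joins surrogate pairs into one code point
        val script = UScript.getScript(cp)
        counts[script] = (counts[script] ?: 0) + 1
        i += Character.charCount(cp)          // 1 or 2 UTF-16 units
    }
    return counts
}
```

The keys of the returned map are script codes such as `UScript.LATIN` or `UScript.MALAYALAM`; they can be turned into readable names with getName or getShortName.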
getScriptExtensions
static fun getScriptExtensions(
c: Int,
set: BitSet!
): Int
Sets code point c's Script_Extensions as script code integers into the output BitSet.
- If c does have Script_Extensions, then the return value is the negative number of Script_Extensions codes (= -set.cardinality()); in this case, the Script property value (normally Common or Inherited) is not included in the set.
- If c does not have Script_Extensions, then the one Script code is put into the set and also returned.
- If c is not a valid code point, then the one UNKNOWN code is put into the set and also returned.
Some characters are commonly used in multiple scripts. For more information, see UAX #24: http://www.unicode.org/reports/tr24/.
Parameters | |
---|---|
c |
Int: code point |
set |
BitSet!: set of script code integers; will be cleared, then bits are set corresponding to c's Script_Extensions |
Return | |
---|---|
Int |
negative number of script codes in c's Script_Extensions, or the non-negative single Script value |
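The return-value convention above can be handled as in the following sketch. The helper name `scriptsOf` is hypothetical; it ignores the sign of the result and simply reads the script codes back out of the BitSet, which is filled in all three cases.

```kotlin
import android.icu.lang.UScript
import java.util.BitSet

/** Hypothetical helper: lists every script code recorded for code point [c]. */
fun scriptsOf(c: Int): List<Int> {
    val set = BitSet()
    val result = UScript.getScriptExtensions(c, set)
    // result < 0:  c has Script_Extensions, and -result == set.cardinality().
    // result >= 0: c has no Script_Extensions; result is the single code in the set.
    val codes = mutableListOf<Int>()
    var bit = set.nextSetBit(0)
    while (bit >= 0) {
        codes += bit
        bit = set.nextSetBit(bit + 1)
    }
    return codes
}
```

For the letter "A" (0x41), which has no Script_Extensions, this returns a one-element list holding `UScript.LATIN`. For a shared character such as U+0640 ARABIC TATWEEL, the list holds several script codes, including `UScript.ARABIC`.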
getShortName
static fun getShortName(scriptCode: Int): String!
Returns the 4-letter ISO 15924 script code, which is the same as the short Unicode script name if Unicode has names for the script. Returns "Mlym" given MALAYALAM.
Parameters | |
---|---|
scriptCode |
Int: int script code |
Return | |
---|---|
String! |
short script name (4-letter code) |
Exceptions | |
---|---|
java.lang.IllegalArgumentException |
if the script code is not valid |
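A small sketch that combines getName and getShortName and guards against invalid codes is shown here. The helper name `describeScript` and its output format are assumptions made for illustration only.

```kotlin
import android.icu.lang.UScript

/**
 * Hypothetical helper: formats a script code as "LongName (Abbr)",
 * or returns null when the code is not a valid script code.
 */
fun describeScript(scriptCode: Int): String? =
    try {
        "${UScript.getName(scriptCode)} (${UScript.getShortName(scriptCode)})"
    } catch (e: IllegalArgumentException) {
        null // both getters reject invalid script codes
    }
```

Given `UScript.MALAYALAM`, this produces "Malayalam (Mlym)".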
getUsage
static fun getUsage(script: Int): UScript.ScriptUsage!
Returns the script usage according to UAX #31 Unicode Identifier and Pattern Syntax. Returns ScriptUsage#NOT_ENCODED if the script is not encoded in Unicode.
Parameters | |
---|---|
script |
Int: script code |
Return | |
---|---|
UScript.ScriptUsage! |
script usage |
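One possible use of the usage value is an identifier policy that admits only recommended scripts. The sketch below is an assumption about how an application might apply it, not a rule defined by this class; the function name is hypothetical.

```kotlin
import android.icu.lang.UScript
import android.icu.lang.UScript.ScriptUsage

/** Hypothetical policy: accept only scripts that UAX #31 marks as recommended. */
fun isRecommendedForIdentifiers(script: Int): Boolean =
    UScript.getUsage(script) == ScriptUsage.RECOMMENDED
```

Scripts that are not encoded in Unicode report `ScriptUsage.NOT_ENCODED` and are therefore rejected by this check.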
hasScript
static fun hasScript(
c: Int,
sc: Int
): Boolean
Returns true if the Script_Extensions of code point c contain script sc. If c does not have explicit Script_Extensions, then this tests whether c has the Script property value sc.
Some characters are commonly used in multiple scripts. For more information, see UAX #24: http://www.unicode.org/reports/tr24/.
Parameters | |
---|---|
c |
Int: code point |
sc |
Int: script code |
Return | |
---|---|
Boolean |
true if sc is in Script_Extensions(c) |
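A typical use is checking whether a whole string can be written in one script. The sketch below assumes a strict policy in which every code point must pass hasScript; the helper name `isWrittenIn` is hypothetical.

```kotlin
import android.icu.lang.UScript

/** Hypothetical helper: true if every code point of [text] is used with [script]. */
fun isWrittenIn(text: String, script: Int): Boolean {
    var i = 0
    while (i < text.length) {
        val cp = text.codePointAt(i)
        if (!UScript.hasScript(cp, script)) return false
        i += Character.charCount(cp)
    }
    return true
}
```

Because hasScript consults Script_Extensions, a shared character such as U+0640 ARABIC TATWEEL passes for `UScript.ARABIC` even though getScript reports it as `UScript.COMMON`. Characters that are Common and have no Script_Extensions, such as spaces and ASCII digits, fail this strict check for any specific script, so a real application would likely need to allow them separately.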
isCased
static fun isCased(script: Int): Boolean
Returns true if, in modern (or most recent) usage of the script, case distinctions are customary. For example, Latn and Cyrl.
Parameters | |
---|---|
script |
Int: script code |
Return | |
---|---|
Boolean |
true if the script is cased |
isRightToLeft
static fun isRightToLeft(script: Int): Boolean
Returns true if the script is written right-to-left. For example, Arab and Hebr.
Parameters | |
---|---|
script |
Int: script code |
Return | |
---|---|
Boolean |
true if the script is right-to-left |
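The three boolean queries (breaksBetweenLetters, isCased, isRightToLeft) can be gathered into one value when laying out text, as in this sketch. The `ScriptTraits` type and `traitsOf` function are hypothetical names introduced for illustration.

```kotlin
import android.icu.lang.UScript

/** Hypothetical summary of the layout-related properties of a script. */
data class ScriptTraits(
    val rightToLeft: Boolean,
    val cased: Boolean,
    val breaksBetweenLetters: Boolean
)

/** Collects the three boolean script properties for [script]. */
fun traitsOf(script: Int) = ScriptTraits(
    rightToLeft = UScript.isRightToLeft(script),
    cased = UScript.isCased(script),
    breaksBetweenLetters = UScript.breaksBetweenLetters(script)
)
```

For example, `traitsOf(UScript.ARABIC).rightToLeft` and `traitsOf(UScript.THAI).breaksBetweenLetters` are both true, while `UScript.LATIN` is cased, left-to-right, and does not break between letters.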